{"id":6394,"date":"2023-06-23T13:53:59","date_gmt":"2023-06-23T13:53:59","guid":{"rendered":"https:\/\/www.clarin.si\/info\/?p=6394"},"modified":"2023-06-26T08:03:58","modified_gmt":"2023-06-26T08:03:58","slug":"new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers","status":"publish","type":"post","link":"https:\/\/www.clarin.si\/info\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\/","title":{"rendered":"New CLASSLA web corpora and tutorial on usage of the corpora via CLARIN.SI concordancers"},"content":{"rendered":"<p>We are pleased to announce that pilot versions (v0.1) of the CLASSLA-web corpora are now available within the CLASSLA Knowledge Center.\u00a0 The corpora include <a href=\"https:\/\/www.clarin.si\/ske\/#dashboard?corpname=classlaweb_hr\" target=\"_blank\" rel=\"noopener\">Croatian<\/a>\u00a0(2.3 billion words),\u00a0<a href=\"https:\/\/www.clarin.si\/ske\/#dashboard?corpname=classlaweb_sr\" target=\"_blank\" rel=\"noopener\">Serbian<\/a>\u00a0(2.4 billion words) and\u00a0<a href=\"https:\/\/www.clarin.si\/ske\/#dashboard?corpname=classlaweb_sl\" target=\"_blank\" rel=\"noopener\">Slovenian<\/a>\u00a0(1.9 billion words).<\/p>\n<p>In addition to the new corpora, a <a href=\"https:\/\/www.clarin.si\/info\/k-centre\/classla-web-bigger-and-better-web-corpora-for-croatian-serbian-and-slovenian-on-clarin-si-concordancers\/\">tutorial on the usage of CLASSLA-web corpora through the concordancers CLARIN.SI<\/a> has been published.<\/p>\n<p>You can read more about the novelties in the CLASSLA Knowledge Center below.<\/p>\n<p><!--more--><\/p>\n<hr \/>\n<p><strong>CLASSLA web corpora of Croatian, Serbian and Slovenian<\/strong><\/p>\n<p>We are\u00a0delighted to announce the release of the pilot versions (v0.1) of the CLASSLA web corpora for\u00a0<a href=\"https:\/\/www.clarin.si\/ske\/#dashboard?corpname=classlaweb_hr\" target=\"_blank\" rel=\"noopener\">Croatian<\/a>\u00a0(2.3 
billion words),\u00a0<a href=\"https:\/\/www.clarin.si\/ske\/#dashboard?corpname=classlaweb_sr\" target=\"_blank\" rel=\"noopener\">Serbian<\/a>\u00a0(2.4 billion words) and\u00a0<a href=\"https:\/\/www.clarin.si\/ske\/#dashboard?corpname=classlaweb_sl\" target=\"_blank\" rel=\"noopener\">Slovenian<\/a>\u00a0(1.9 billion words). The main features of the newly released corpora, aside from their massive size and recency (crawled in 2022), are their\u00a0<a href=\"https:\/\/huggingface.co\/classla\/xlm-roberta-base-multilingual-text-genre-classifier\" target=\"_blank\" rel=\"noopener\">automatic enrichment with genre information<\/a>\u00a0and their linguistic processing with the improved\u00a0<a href=\"https:\/\/pypi.org\/project\/classla\/\" target=\"_blank\" rel=\"noopener\">CLASSLA-Stanza annotation pipeline<\/a>\u00a0(applied version to be released soon). The corpora are available for search via the CLARIN.SI concordancers,\u00a0<a href=\"https:\/\/www.clarin.si\/ske\/#open\" target=\"_blank\" rel=\"noopener\">Crystal NoSketchEngine<\/a>,\u00a0<a href=\"https:\/\/www.clarin.si\/noske\/\" target=\"_blank\" rel=\"noopener\">Bonito NoSketchEngine<\/a>\u00a0and\u00a0<a href=\"https:\/\/www.clarin.si\/kontext\/corpora\/corplist\" target=\"_blank\" rel=\"noopener\">KonText<\/a>. The pilot versions of these corpora are intended to gather valuable user feedback, while the official release (v1.0) of the three existing corpora, along with web corpora for Bosnian, Montenegrin, Macedonian, and Bulgarian, is scheduled for later this year.<\/p>\n<p>We warmly invite you to explore our corpora. Please reach out to us at\u00a0<a href=\"mailto:helpdesk.classla@clarin.si\">helpdesk.classla@clarin.si<\/a>\u00a0with any ideas for improvements; we will do our best to implement them in the upcoming official release. 
We also encourage you to share with us how you plan to use these corpora in your research, as well as any other use cases you may have in mind.<\/p>\n<p>To give you some ideas on how the corpora can be used in your research, you are invited to read\u00a0<a href=\"https:\/\/www.clarin.si\/info\/k-centre\/classla-web-bigger-and-better-web-corpora-for-croatian-serbian-and-slovenian-on-clarin-si-concordancers\/\" target=\"_blank\" rel=\"noopener\">our blog post on the use of CLASSLA web corpora via the open CLARIN.SI concordancers<\/a>. The step-by-step tutorial covers a wide range of concordancer functionalities, including finding collocations in different genres, analyzing word statistics, and exploring the use of non-standard words. This resource is particularly suited for linguists, language teachers and digital humanists.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We are pleased to announce that pilot versions (v0.1) of the CLASSLA-web corpora are now available within the CLASSLA Knowledge Center.\u00a0 The corpora include Croatian\u00a0(2.3 billion words),\u00a0Serbian\u00a0(2.4 billion words) and\u00a0Slovenian\u00a0(1.9 billion words). In addition to the new corpora, a tutorial on the usage of CLASSLA-web corpora through the concordancers CLARIN.SI has been published. 
You can [&hellip;]<\/p>\n","protected":false},"author":12,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[34],"tags":[],"class_list":["post-6394","post","type-post","status-publish","format-standard","hentry","category-events","has-post-title","has-post-date","has-post-category","has-post-tag","has-post-comment","has-post-author",""],"aioseo_notices":[],"aioseo_head":"\n\t\t<!-- All in One SEO 4.9.8 - aioseo.com -->\n\t<meta name=\"description\" content=\"We are pleased to announce that pilot versions (v0.1) of the CLASSLA-web corpora are now available within the CLASSLA Knowledge Center. The corpora include Croatian (2.3 billion words), Serbian (2.4 billion words) and Slovenian (1.9 billion words). In addition to the new corpora, a tutorial on the usage of CLASSLA-web corpora through the concordancers CLARIN.SI has been published. 
You can\" \/>\n\t<meta name=\"robots\" content=\"max-image-preview:large\" \/>\n\t<meta name=\"author\" content=\"Katja Meden\"\/>\n\t<meta name=\"google-site-verification\" content=\"LiA10aq97L10baWhrk27m-8KV46nP_6qo6Z8pFmPF88\" \/>\n\t<link rel=\"canonical\" href=\"https:\/\/www.clarin.si\/info\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\/\" \/>\n\t<meta name=\"generator\" content=\"All in One SEO (AIOSEO) 4.9.8\" \/>\n\t\t<meta property=\"og:locale\" content=\"en_GB\" \/>\n\t\t<meta property=\"og:site_name\" content=\"CLARIN Slovenija - Slovenska raziskovalna infrastruktura za jezikovne vire in tehnologije\" \/>\n\t\t<meta property=\"og:type\" content=\"article\" \/>\n\t\t<meta property=\"og:title\" content=\"New CLASSLA web corpora and tutorial on usage of the corpora via CLARIN.SI concordancers - CLARIN Slovenija\" \/>\n\t\t<meta property=\"og:description\" content=\"We are pleased to announce that pilot versions (v0.1) of the CLASSLA-web corpora are now available within the CLASSLA Knowledge Center. The corpora include Croatian (2.3 billion words), Serbian (2.4 billion words) and Slovenian (1.9 billion words). In addition to the new corpora, a tutorial on the usage of CLASSLA-web corpora through the concordancers CLARIN.SI has been published. 
You can\" \/>\n\t\t<meta property=\"og:url\" content=\"https:\/\/www.clarin.si\/info\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\/\" \/>\n\t\t<meta property=\"article:published_time\" content=\"2023-06-23T13:53:59+00:00\" \/>\n\t\t<meta property=\"article:modified_time\" content=\"2023-06-26T08:03:58+00:00\" \/>\n\t\t<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n\t\t<meta name=\"twitter:title\" content=\"New CLASSLA web corpora and tutorial on usage of the corpora via CLARIN.SI concordancers - CLARIN Slovenija\" \/>\n\t\t<meta name=\"twitter:description\" content=\"We are pleased to announce that pilot versions (v0.1) of the CLASSLA-web corpora are now available within the CLASSLA Knowledge Center. The corpora include Croatian (2.3 billion words), Serbian (2.4 billion words) and Slovenian (1.9 billion words). In addition to the new corpora, a tutorial on the usage of CLASSLA-web corpora through the concordancers CLARIN.SI has been published. 
You can\" \/>\n\t\t<script type=\"application\/ld+json\" class=\"aioseo-schema\">\n\t\t\t{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"BlogPosting\",\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\\\/#blogposting\",\"name\":\"New CLASSLA web corpora and tutorial on usage of the corpora via CLARIN.SI concordancers - CLARIN Slovenija\",\"headline\":\"New CLASSLA web corpora and tutorial on usage of the corpora via CLARIN.SI concordancers\",\"author\":{\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/author\\\/katja\\\/#author\"},\"publisher\":{\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/#organization\"},\"image\":{\"@type\":\"ImageObject\",\"url\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/wp-content\\\/uploads\\\/2014\\\/08\\\/Clarin-SI-logo.png\",\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/#articleImage\",\"width\":359,\"height\":150},\"datePublished\":\"2023-06-23T13:53:59+00:00\",\"dateModified\":\"2023-06-26T08:03:58+00:00\",\"inLanguage\":\"en-GB\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\\\/#webpage\"},\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\\\/#webpage\"},\"articleSection\":\"Events, English, 
pll_6495a38adfc3a\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\\\/#breadcrumblist\",\"itemListElement\":[{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info#listItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.clarin.si\\\/info\",\"nextItem\":{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/category\\\/events\\\/#listItem\",\"name\":\"Events\"}},{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/category\\\/events\\\/#listItem\",\"position\":2,\"name\":\"Events\",\"item\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/category\\\/events\\\/\",\"nextItem\":{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\\\/#listItem\",\"name\":\"New CLASSLA web corpora and tutorial on usage of the corpora via CLARIN.SI concordancers\"},\"previousItem\":{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info#listItem\",\"name\":\"Home\"}},{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\\\/#listItem\",\"position\":3,\"name\":\"New CLASSLA web corpora and tutorial on usage of the corpora via CLARIN.SI concordancers\",\"previousItem\":{\"@type\":\"ListItem\",\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/category\\\/events\\\/#listItem\",\"name\":\"Events\"}}]},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/#organization\",\"name\":\"CLARIN Slovenija\",\"description\":\"Slovenska raziskovalna infrastruktura za jezikovne vire in 
tehnologije\",\"url\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"url\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/wp-content\\\/uploads\\\/2014\\\/08\\\/Clarin-SI-logo.png\",\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\\\/#organizationLogo\",\"width\":359,\"height\":150},\"image\":{\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\\\/#organizationLogo\"}},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/author\\\/katja\\\/#author\",\"url\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/author\\\/katja\\\/\",\"name\":\"Katja Meden\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\\\/#webpage\",\"url\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\\\/\",\"name\":\"New CLASSLA web corpora and tutorial on usage of the corpora via CLARIN.SI concordancers - CLARIN Slovenija\",\"description\":\"We are pleased to announce that pilot versions (v0.1) of the CLASSLA-web corpora are now available within the CLASSLA Knowledge Center. The corpora include Croatian (2.3 billion words), Serbian (2.4 billion words) and Slovenian (1.9 billion words). In addition to the new corpora, a tutorial on the usage of CLASSLA-web corpora through the concordancers CLARIN.SI has been published. 
You can\",\"inLanguage\":\"en-GB\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/#website\"},\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\\\/#breadcrumblist\"},\"author\":{\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/author\\\/katja\\\/#author\"},\"creator\":{\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/author\\\/katja\\\/#author\"},\"datePublished\":\"2023-06-23T13:53:59+00:00\",\"dateModified\":\"2023-06-26T08:03:58+00:00\"},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/#website\",\"url\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/\",\"name\":\"CLARIN Slovenija\",\"description\":\"Slovenska raziskovalna infrastruktura za jezikovne vire in tehnologije\",\"inLanguage\":\"en-GB\",\"publisher\":{\"@id\":\"https:\\\/\\\/www.clarin.si\\\/info\\\/#organization\"}}]}\n\t\t<\/script>\n\t\t<!-- All in One SEO -->\n\n","aioseo_head_json":{"title":"New CLASSLA web corpora and tutorial on usage of the corpora via CLARIN.SI concordancers - CLARIN Slovenija","description":"We are pleased to announce that pilot versions (v0.1) of the CLASSLA-web corpora are now available within the CLASSLA Knowledge Center. The corpora include Croatian (2.3 billion words), Serbian (2.4 billion words) and Slovenian (1.9 billion words). In addition to the new corpora, a tutorial on the usage of CLASSLA-web corpora through the concordancers CLARIN.SI has been published. 
You can","canonical_url":"https:\/\/www.clarin.si\/info\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\/","robots":"max-image-preview:large","keywords":"","webmasterTools":{"google-site-verification":"LiA10aq97L10baWhrk27m-8KV46nP_6qo6Z8pFmPF88","miscellaneous":""},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"BlogPosting","@id":"https:\/\/www.clarin.si\/info\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\/#blogposting","name":"New CLASSLA web corpora and tutorial on usage of the corpora via CLARIN.SI concordancers - CLARIN Slovenija","headline":"New CLASSLA web corpora and tutorial on usage of the corpora via CLARIN.SI concordancers","author":{"@id":"https:\/\/www.clarin.si\/info\/author\/katja\/#author"},"publisher":{"@id":"https:\/\/www.clarin.si\/info\/#organization"},"image":{"@type":"ImageObject","url":"https:\/\/www.clarin.si\/info\/wp-content\/uploads\/2014\/08\/Clarin-SI-logo.png","@id":"https:\/\/www.clarin.si\/info\/#articleImage","width":359,"height":150},"datePublished":"2023-06-23T13:53:59+00:00","dateModified":"2023-06-26T08:03:58+00:00","inLanguage":"en-GB","mainEntityOfPage":{"@id":"https:\/\/www.clarin.si\/info\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\/#webpage"},"isPartOf":{"@id":"https:\/\/www.clarin.si\/info\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\/#webpage"},"articleSection":"Events, English, 
pll_6495a38adfc3a"},{"@type":"BreadcrumbList","@id":"https:\/\/www.clarin.si\/info\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\/#breadcrumblist","itemListElement":[{"@type":"ListItem","@id":"https:\/\/www.clarin.si\/info#listItem","position":1,"name":"Home","item":"https:\/\/www.clarin.si\/info","nextItem":{"@type":"ListItem","@id":"https:\/\/www.clarin.si\/info\/category\/events\/#listItem","name":"Events"}},{"@type":"ListItem","@id":"https:\/\/www.clarin.si\/info\/category\/events\/#listItem","position":2,"name":"Events","item":"https:\/\/www.clarin.si\/info\/category\/events\/","nextItem":{"@type":"ListItem","@id":"https:\/\/www.clarin.si\/info\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\/#listItem","name":"New CLASSLA web corpora and tutorial on usage of the corpora via CLARIN.SI concordancers"},"previousItem":{"@type":"ListItem","@id":"https:\/\/www.clarin.si\/info#listItem","name":"Home"}},{"@type":"ListItem","@id":"https:\/\/www.clarin.si\/info\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\/#listItem","position":3,"name":"New CLASSLA web corpora and tutorial on usage of the corpora via CLARIN.SI concordancers","previousItem":{"@type":"ListItem","@id":"https:\/\/www.clarin.si\/info\/category\/events\/#listItem","name":"Events"}}]},{"@type":"Organization","@id":"https:\/\/www.clarin.si\/info\/#organization","name":"CLARIN Slovenija","description":"Slovenska raziskovalna infrastruktura za jezikovne vire in 
tehnologije","url":"https:\/\/www.clarin.si\/info\/","logo":{"@type":"ImageObject","url":"https:\/\/www.clarin.si\/info\/wp-content\/uploads\/2014\/08\/Clarin-SI-logo.png","@id":"https:\/\/www.clarin.si\/info\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\/#organizationLogo","width":359,"height":150},"image":{"@id":"https:\/\/www.clarin.si\/info\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\/#organizationLogo"}},{"@type":"Person","@id":"https:\/\/www.clarin.si\/info\/author\/katja\/#author","url":"https:\/\/www.clarin.si\/info\/author\/katja\/","name":"Katja Meden"},{"@type":"WebPage","@id":"https:\/\/www.clarin.si\/info\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\/#webpage","url":"https:\/\/www.clarin.si\/info\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\/","name":"New CLASSLA web corpora and tutorial on usage of the corpora via CLARIN.SI concordancers - CLARIN Slovenija","description":"We are pleased to announce that pilot versions (v0.1) of the CLASSLA-web corpora are now available within the CLASSLA Knowledge Center. The corpora include Croatian (2.3 billion words), Serbian (2.4 billion words) and Slovenian (1.9 billion words). In addition to the new corpora, a tutorial on the usage of CLASSLA-web corpora through the concordancers CLARIN.SI has been published. 
You can","inLanguage":"en-GB","isPartOf":{"@id":"https:\/\/www.clarin.si\/info\/#website"},"breadcrumb":{"@id":"https:\/\/www.clarin.si\/info\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\/#breadcrumblist"},"author":{"@id":"https:\/\/www.clarin.si\/info\/author\/katja\/#author"},"creator":{"@id":"https:\/\/www.clarin.si\/info\/author\/katja\/#author"},"datePublished":"2023-06-23T13:53:59+00:00","dateModified":"2023-06-26T08:03:58+00:00"},{"@type":"WebSite","@id":"https:\/\/www.clarin.si\/info\/#website","url":"https:\/\/www.clarin.si\/info\/","name":"CLARIN Slovenija","description":"Slovenska raziskovalna infrastruktura za jezikovne vire in tehnologije","inLanguage":"en-GB","publisher":{"@id":"https:\/\/www.clarin.si\/info\/#organization"}}]},"og:locale":"en_GB","og:site_name":"CLARIN Slovenija - Slovenska raziskovalna infrastruktura za jezikovne vire in tehnologije","og:type":"article","og:title":"New CLASSLA web corpora and tutorial on usage of the corpora via CLARIN.SI concordancers - CLARIN Slovenija","og:description":"We are pleased to announce that pilot versions (v0.1) of the CLASSLA-web corpora are now available within the CLASSLA Knowledge Center. The corpora include Croatian (2.3 billion words), Serbian (2.4 billion words) and Slovenian (1.9 billion words). In addition to the new corpora, a tutorial on the usage of CLASSLA-web corpora through the concordancers CLARIN.SI has been published. 
You can","og:url":"https:\/\/www.clarin.si\/info\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\/","article:published_time":"2023-06-23T13:53:59+00:00","article:modified_time":"2023-06-26T08:03:58+00:00","twitter:card":"summary_large_image","twitter:title":"New CLASSLA web corpora and tutorial on usage of the corpora via CLARIN.SI concordancers - CLARIN Slovenija","twitter:description":"We are pleased to announce that pilot versions (v0.1) of the CLASSLA-web corpora are now available within the CLASSLA Knowledge Center. The corpora include Croatian (2.3 billion words), Serbian (2.4 billion words) and Slovenian (1.9 billion words). In addition to the new corpora, a tutorial on the usage of CLASSLA-web corpora through the concordancers CLARIN.SI has been published. You can"},"aioseo_meta_data":{"post_id":"6394","title":null,"description":null,"keywords":null,"keyphrases":null,"primary_term":null,"canonical_url":null,"og_title":null,"og_description":null,"og_object_type":"default","og_image_type":"default","og_image_custom_url":null,"og_image_custom_fields":null,"og_image_url":null,"og_image_width":null,"og_image_height":null,"og_video":null,"og_custom_url":null,"og_article_section":null,"og_article_tags":null,"twitter_use_og":false,"twitter_card":"default","twitter_image_type":"default","twitter_image_custom_url":null,"twitter_image_custom_fields":null,"twitter_image_url":null,"twitter_title":null,"twitter_description":null,"schema_type":"default","schema_type_options":null,"schema":{"blockGraphs":[],"customGraphs":[],"default":{"data":{"Article":[],"Course":[],"Dataset":[],"FAQPage":[],"Movie":[],"Person":[],"Product":[],"ProductReview":[],"Car":[],"Recipe":[],"Service":[],"SoftwareApplication":[],"WebPage":[]},"graphName":"","isEnabled":true},"graphs":[]},"pillar_content":false,"robots_default":true,"robots_noindex":false,"robots_noarchive":false,"robots_nosnippet":false,"robots_nofollow":false,"robots_noimageindex":false,"
robots_noodp":false,"robots_notranslate":false,"robots_max_snippet":null,"robots_max_videopreview":null,"robots_max_imagepreview":"large","priority":null,"frequency":null,"local_seo":null,"limit_modified_date":false,"ai":null,"breadcrumb_settings":null,"seo_analyzer_scan_date":null,"created":"2026-05-21 08:51:36","updated":"2026-05-21 08:51:36"},"aioseo_breadcrumb":"<div class=\"aioseo-breadcrumbs\"><span class=\"aioseo-breadcrumb\">\n\t\t\t<a href=\"https:\/\/www.clarin.si\/info\" title=\"Home\">Home<\/a>\n\t\t<\/span><span class=\"aioseo-breadcrumb-separator\">&raquo;<\/span><span class=\"aioseo-breadcrumb\">\n\t\t\t<a href=\"https:\/\/www.clarin.si\/info\/category\/events\/\" title=\"Events\">Events<\/a>\n\t\t<\/span><span class=\"aioseo-breadcrumb-separator\">&raquo;<\/span><span class=\"aioseo-breadcrumb\">\n\t\t\tNew CLASSLA web corpora and tutorial on usage of the corpora via CLARIN.SI concordancers\n\t\t<\/span><\/div>","aioseo_breadcrumb_json":[{"label":"Home","link":"https:\/\/www.clarin.si\/info"},{"label":"Events","link":"https:\/\/www.clarin.si\/info\/category\/events\/"},{"label":"New CLASSLA web corpora and tutorial on usage of the corpora via CLARIN.SI 
concordancers","link":"https:\/\/www.clarin.si\/info\/new-classla-web-corpora-and-tutorial-on-usage-of-the-corpora-via-clarin-si-concordancers\/"}],"_links":{"self":[{"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/posts\/6394","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/users\/12"}],"replies":[{"embeddable":true,"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/comments?post=6394"}],"version-history":[{"count":5,"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/posts\/6394\/revisions"}],"predecessor-version":[{"id":6414,"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/posts\/6394\/revisions\/6414"}],"wp:attachment":[{"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/media?parent=6394"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/categories?post=6394"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/tags?post=6394"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}