{"id":8416,"date":"2025-10-21T10:53:52","date_gmt":"2025-10-21T10:53:52","guid":{"rendered":"https:\/\/www.clarin.si\/info\/?p=8416"},"modified":"2025-10-21T11:07:50","modified_gmt":"2025-10-21T11:07:50","slug":"parliamentary-parlacap-dataset-and-cap-topic-classifier","status":"publish","type":"post","link":"https:\/\/www.clarin.si\/info\/parliamentary-parlacap-dataset-and-cap-topic-classifier\/","title":{"rendered":"Parliamentary ParlaCAP Dataset and CAP Topic Classifier"},"content":{"rendered":"<p style=\"font-weight: 400;\">We are pleased to announce the release of the <a href=\"https:\/\/doi.org\/10.23669\/1ZTELP\">ParlaCAP dataset<\/a>: an extension of the <a href=\"https:\/\/hdl.handle.net\/11356\/2004\">ParlaMint 5.0<\/a> collection enriched with sentiment and topic annotations, as well as extended metadata on parties and democracies.<\/p>\n<p><!--more--><\/p>\n<div><span lang=\"EN-GB\">The dataset contains around 8 million speeches from 28 European parliaments and is provided in a tabular format, enhancing the usability of the ParlaMint corpora for social and political science research. <\/span>As part of the <a href=\"https:\/\/oscars-project.eu\/projects\/parlacap-comparing-agenda-settings-across-parliaments-parlamint-dataset\">OSCARS ParlaCAP project<\/a>, the dataset was published through the Croatian CESSDA node <a href=\"https:\/\/www.crossda.hr\/\">CROSSDA<\/a>, thereby promoting collaboration between infrastructures. We also released a <a href=\"https:\/\/huggingface.co\/classla\/ParlaCAP-Topic-Classifier\">multilingual topic classifier<\/a> that uses the CAP (Comparative Agendas Project) labels, and <a href=\"https:\/\/github.com\/clarinsi\/ParlaCAP-Analysis-Tutorials\">tutorials for analysing ParlaCAP data in Python<\/a>. 
More information is available <a href=\"https:\/\/www.clarin.eu\/sites\/default\/files\/18-Bazaar-Ljubesic.pdf\">here<\/a>.<\/div>\n","protected":false},"excerpt":{"rendered":"<p>We are pleased to announce the release of the ParlaCAP dataset: an extension of the ParlaMint 5.0 collection enriched with sentiment and topic annotations, as well as extended metadata on parties and democracies.<\/p>\n","protected":false},"author":12,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[34],"tags":[],"class_list":["post-8416","post","type-post","status-publish","format-standard","hentry","category-events","has-post-title","has-post-date","has-post-category","has-post-tag","has-post-comment","has-post-author",""],"_links":{"self":[{"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/posts\/8416","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/users\/12"}],"replies":[{"embeddable":true,"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/comments?post=8416"}],"version-history":[{"count":4,"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/posts\/8416\/revisions"}],"predecessor-version":[{"id":8430,"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/posts\/8416\/revisions\/8430"}],"wp:attachment":[{"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/media?parent=8416"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/categories?post=8416"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.clarin.si\/info\/wp-json\/wp\/v2\/tags?post=8416"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}