dc.contributor.author Terčon, Luka
dc.contributor.author Dobrovoljc, Kaja
dc.contributor.author Ljubešić, Nikola
dc.date.accessioned 2025-02-09T11:44:20Z
dc.date.available 2025-02-09T11:44:20Z
dc.date.issued 2025-02-07
dc.identifier.uri http://hdl.handle.net/11356/2015
dc.description This model for UD dependency parsing of standard Slovenian was built with the CLASSLA-Stanza tool (https://github.com/clarinsi/classla) by training on the SUK training corpus (http://hdl.handle.net/11356/1747), using the CLARIN.SI-embed.sl word embeddings (http://hdl.handle.net/11356/1204) expanded with the MaCoCu-sl Slovene web corpus (http://hdl.handle.net/11356/1517). The estimated LAS of the parser is ~90.42. Compared with the previous version, this model was trained on the improved SUK 1.1 version of the training corpus.
dc.language.iso slv
dc.publisher Jožef Stefan Institute
dc.relation.isreferencedby https://doi.org/10.5281/zenodo.13936406
dc.relation.replaces http://hdl.handle.net/11356/1769
dc.rights Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dc.rights.uri https://creativecommons.org/licenses/by-sa/4.0/
dc.rights.label PUB
dc.source.uri https://github.com/clarinsi/classla
dc.subject parsing
dc.subject language model
dc.title The CLASSLA-Stanza model for UD dependency parsing of standard Slovenian 2.2
dc.type toolService
metashare.ResourceInfo#ContentInfo.detailedType tool
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent true
has.files yes
branding CLARIN.SI data & tools
contact.person Luka Terčon luka.tercon@fri.uni-lj.si Faculty of Computer and Information Science, University of Ljubljana
contact.person Nikola Ljubešić nikola.ljubesic@ijs.si Jožef Stefan Institute
sponsor ARRS (Slovenian Research Agency) P6-0411 Language Resources and Technologies for Slovene nationalFunds
sponsor Ministry of Culture C3340-20-278001 Development of Slovene in a Digital Environment Other
sponsor ARRS (Slovenian Research Agency) J7-4642 MEZZANINE nationalFunds
sponsor ARRS (Slovenian Research Agency) Z6-4617 Treebank-Driven Approach to the Study of Spoken Slovenian nationalFunds
files.count 2
files.size 201556352
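
The description reports an estimated LAS (labeled attachment score) of ~90.42. As a reminder of what that figure measures, here is a minimal sketch in Python: the percentage of tokens whose predicted head and dependency relation both match the gold annotation. The function name and the toy sentence are hypothetical, and the record does not say which evaluation script produced the reported score, so this simplified token-level definition is an assumption and may differ in detail from the official UD evaluation.

```python
from typing import List, Tuple

# One arc per token: (index of the head token, dependency relation label).
Arc = Tuple[int, str]


def labeled_attachment_score(gold: List[Arc], predicted: List[Arc]) -> float:
    """Return the percentage of tokens with both head and label correct.

    Simplified sketch: assumes gold and predicted share the same tokenization.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty sentence")
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    return 100.0 * correct / len(gold)


# Toy four-token sentence (invented for illustration): one label is wrong.
gold_arcs = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "punct")]
predicted_arcs = [(2, "nsubj"), (0, "root"), (2, "iobj"), (2, "punct")]

print(f"{labeled_attachment_score(gold_arcs, predicted_arcs):.2f}")  # prints 75.00
```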


 Files in this item

This item is Publicly Available and licensed under: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Name depparse.zip
Size 86.94 MB
Format application/zip
Description Language model
MD5 6440136e448f4d1c91de583e3ee4e5b4
Name sl_ssj.pretrain.zip
Size 105.27 MB
Format application/zip
Description Pretrained word embeddings
MD5 653cfb0ad1eb2accb2f50ae22908b474
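
The MD5 values listed for the two files can be used to check that a download arrived intact. The following is a minimal sketch in Python using only the standard library. The file names and checksums are copied from this record; the function names and the download directory are hypothetical.

```python
import hashlib
from pathlib import Path

# File names and MD5 checksums as listed in this record.
EXPECTED_MD5 = {
    "depparse.zip": "6440136e448f4d1c91de583e3ee4e5b4",
    "sl_ssj.pretrain.zip": "653cfb0ad1eb2accb2f50ae22908b474",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory):
    """Map each expected file name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        candidate = Path(directory) / name
        results[name] = candidate.is_file() and md5_of_file(candidate) == expected
    return results
```

Calling `verify_downloads("downloads")`, where `downloads` is a placeholder for wherever the archives were saved, returns a dictionary with one boolean per file. A `False` entry means the file is missing or its checksum differs from the one listed here.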
