dc.contributor.author Ljubešić, Nikola
dc.date.accessioned 2019-10-09T15:09:00Z
dc.date.available 2019-10-09T15:09:00Z
dc.date.issued 2019-10-09
dc.identifier.uri http://hdl.handle.net/11356/1251
dc.description The model for morphosyntactic annotation of standard Slovenian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the ssj500k training corpus (http://hdl.handle.net/11356/1210) and using the CLARIN.SI-embed.sl word embeddings (http://hdl.handle.net/11356/1204). The model simultaneously produces UPOS, FEATS, and XPOS (MULTEXT-East) labels. The estimated F1 of the XPOS annotations is ~96.7.
dc.language.iso slv
dc.publisher Jožef Stefan Institute
dc.relation.isreplacedby http://hdl.handle.net/11356/1312
dc.rights Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dc.rights.uri https://creativecommons.org/licenses/by-sa/4.0/
dc.rights.label PUB
dc.source.uri https://github.com/clarinsi/classla-stanfordnlp
dc.subject part-of-speech tagging
dc.subject language model
dc.title The CLASSLA-StanfordNLP model for morphosyntactic annotation of standard Slovenian
dc.type toolService
metashare.ResourceInfo#ContentInfo.detailedType tool
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent true
hidden hidden
has.files yes
branding CLARIN.SI data & tools
contact.person Nikola Ljubešić nikola.ljubesic@ijs.si Jožef Stefan Institute
sponsor ARRS (Slovenian Research Agency) P6-0411 Language Resources and Technologies for Slovene nationalFunds
sponsor ARRS (Slovenian Research Agency) J7-8280 FRENK: Resources, methods, and tools for the understanding, identification, and classification of various forms of socially unacceptable discourse in the information society nationalFunds
sponsor ARRS (Slovenian Research Agency) N6-0099 LiLaH: Linguistic Landscape of Hate Speech nationalFunds
files.count 2
files.size 1579974426


 Files in this item

This item is Publicly Available and licensed under: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Name ssj500k
Size 21.27 MB
Format Unknown
Description Language model
MD5 914451b6465366c598c25e33442a809e

Name ssj500k.pretrain.pt
Size 1.45 GB
Format Unknown
Description Pretrained word embeddings
MD5 a2368ff55e0ef066bfe9e702fc670e33
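
The MD5 values listed for the two files can be used to check that a download completed without corruption. The following is a minimal sketch, assuming both files were saved under their listed names in a single local directory; the function names `md5_of_file` and `verify_downloads` are illustrative and not part of the CLASSLA-StanfordNLP tool.

```python
import hashlib
from pathlib import Path

# File names and MD5 checksums as listed in this record.
EXPECTED_MD5 = {
    "ssj500k": "914451b6465366c598c25e33442a809e",
    "ssj500k.pretrain.pt": "a2368ff55e0ef066bfe9e702fc670e33",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each expected file name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name  # assumed local location of the download
        results[name] = path.is_file() and md5_of_file(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify_downloads().items():
        print(f"{name}: {'OK' if ok else 'missing or checksum mismatch'}")
```

Reading in chunks matters here because the embeddings file is about 1.45 GB and would otherwise be loaded into memory in one piece.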
