
 
dc.contributor.author Ljubešić, Nikola
dc.contributor.author Osenova, Petya
dc.contributor.author Simov, Kiril
dc.date.accessioned 2020-06-29T11:09:22Z
dc.date.available 2020-06-29T11:09:22Z
dc.date.issued 2020-06-24
dc.identifier.uri http://hdl.handle.net/11356/1326
dc.description This model for morphosyntactic annotation of standard Bulgarian was built with the CLASSLA-StanfordNLP tool (https://github.com/clarinsi/classla-stanfordnlp) by training on the BulTreeBank training corpus (http://hdl.handle.net/11495/D93F-C6E9-65D9-2) and using the CoNLL2017 word embeddings (http://hdl.handle.net/11234/1-1989). The model produces UPOS, FEATS and XPOS (MULTEXT-East) labels simultaneously. The estimated F1 score of the XPOS annotations is ~96.8.
dc.language.iso bul
dc.publisher Jožef Stefan Institute
dc.publisher IICT-BAS
dc.relation.isreferencedby http://dx.doi.org/10.18653/v1/W19-3704
dc.relation.isreplacedby http://hdl.handle.net/11356/1394
dc.rights Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dc.rights.uri https://creativecommons.org/licenses/by-sa/4.0/
dc.rights.label PUB
dc.source.uri https://github.com/clarinsi/classla-stanfordnlp
dc.subject part-of-speech tagging
dc.subject language model
dc.title The CLASSLA-StanfordNLP model for morphosyntactic annotation of standard Bulgarian 1.0
dc.type toolService
metashare.ResourceInfo#ContentInfo.detailedType tool
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent true
hidden hidden
has.files yes
branding CLARIN.SI data & tools
contact.person Nikola Ljubešić nikola.ljubesic@ijs.si Jožef Stefan Institute
sponsor ARRS (Slovenian Research Agency) P6-0411 Language Resources and Technologies for Slovene nationalFunds
sponsor Ministry of Education and Science Republic of Bulgaria DO01-272/16.12.2019 Bulgarian National Interdisciplinary Research e-Infrastructure for Resources and Technologies CLaDA-BG nationalFunds
files.count 2
files.size 466059035


 Files in this item (444.47 MB in total)
This item is publicly available and licensed under Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0).
Name: BTB
Size: 62.21 MB
Format: Unknown
Description: Language model
MD5: 4c208db4c106990c570004c07bb13be2

Name: BTB.pretrain.pt
Size: 382.25 MB
Format: Unknown
Description: Pretrained word embeddings
MD5: 54093d4962f7dbe48aba1dbddc836ffa
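
Because both files are large binary downloads, it can be worth checking them against the MD5 checksums listed above before loading them. The following is a minimal sketch, assuming the two files were saved under their listed names (BTB and BTB.pretrain.pt) in a single local directory. The helper names md5_of_file and verify_downloads are hypothetical and are not part of CLASSLA-StanfordNLP.

```python
import hashlib
from pathlib import Path

# MD5 checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "BTB": "4c208db4c106990c570004c07bb13be2",
    "BTB.pretrain.pt": "54093d4962f7dbe48aba1dbddc836ffa",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each expected file name to True (match), False (mismatch) or None (missing).

    Assumption: the files keep the names shown in the record's file listing.
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of_file(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "not found"}
    for name, status in verify_downloads().items():
        print(f"{name}: {labels[status]}")
```

Run from the download directory, the script prints one status line per file. A MISMATCH usually points to an incomplete or corrupted download.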
