dc.contributor.author |
Lebar Bajec, Iztok |
dc.contributor.author |
Repar, Andraž |
dc.contributor.author |
Demšar, Jure |
dc.contributor.author |
Bajec, Žan |
dc.contributor.author |
Rizvič, Mitja |
dc.contributor.author |
Kumperščak, Borut |
dc.contributor.author |
Bajec, Marko |
dc.date.accessioned |
2022-12-02T10:46:08Z |
dc.date.available |
2022-12-02T10:46:08Z |
dc.date.issued |
2022-12-01 |
dc.identifier.uri |
http://hdl.handle.net/11356/1736 |
dc.description |
This Neural Machine Translation model for the Slovene-English language pair was trained following the NVIDIA NeMo NMT AAYN recipe (for details, see the official NVIDIA NeMo NMT documentation, https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/machine_translation/machine_translation.html, and the NVIDIA NeMo GitHub repository, https://github.com/NVIDIA/NeMo). It translates text from Slovene to English and vice versa.
The training corpus was built from publicly available datasets, including the Parallel corpus EN-SL RSDO4 1.0 (https://www.clarin.si/repository/xmlui/handle/11356/1457), as well as a small portion of proprietary data. In total, the training corpus consisted of 32,638,758 translation pairs and the validation corpus of 8,163 translation pairs. The model was trained on 64 GPUs. On the validation corpus it reached a SacreBLEU score of 48.3191 (at epoch 37) for translation from Slovene to English and a SacreBLEU score of 53.8191 (at epoch 47) for translation from English to Slovene. |
dc.language.iso |
slv |
dc.language.iso |
eng |
dc.publisher |
Faculty of Computer and Information Science, University of Ljubljana |
dc.relation.isreferencedby |
https://github.com/clarinsi/Slovene_NMT |
dc.rights |
Apache License 2.0 |
dc.rights.uri |
https://opensource.org/licenses/Apache-2.0 |
dc.rights.label |
PUB |
dc.source.uri |
https://rsdo.slovenscina.eu/en/machine-translation |
dc.subject |
machine translation |
dc.subject |
NeMo |
dc.subject |
model |
dc.title |
Neural Machine Translation model for Slovene-English language pair RSDO-DS4-NMT 1.2.6 |
dc.type |
toolService |
metashare.ResourceInfo#ContentInfo.detailedType |
tool |
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent |
true |
has.files |
yes |
branding |
CLARIN.SI data & tools |
demo.uri |
https://www.slovenscina.eu/en/prevajalnik |
contact.person |
Iztok Lebar Bajec ilb@fri.uni-lj.si Faculty of Computer and Information Science, University of Ljubljana |
sponsor |
Ministry of Culture C3340-20-278001 Development of Slovene in a Digital Environment Other |
files.count |
2 |
files.size |
3968004854 |
Files in this item
This item is Publicly Available and licensed under: Apache License 2.0
- Name
- slen_GEN_nemo-1.2.6.tar.zst
- Size
- 1.85 GB
- Format
- Unknown
- Description
- RSDO DS4 NMT SLEN 1.2.6
- MD5
- e8ccb661e27aa3469b7b943a928282f2
- Name
- ensl_GEN_nemo-1.2.6.tar.zst
- Size
- 1.85 GB
- Format
- Unknown
- Description
- RSDO DS4 NMT ENSL 1.2.6
- MD5
- ea697f3fbc2f8ccb22c594c74b4a1cfe
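The MD5 values listed for the two archives can be used to check a download for corruption. The following is a minimal Python sketch, not part of the original record: the helper names `md5_of_file` and `verify_downloads` are hypothetical, and it assumes both archives were saved under their listed names in a single directory.

```python
import hashlib
from pathlib import Path

# File names and MD5 digests copied from the file listing of this record.
EXPECTED_MD5 = {
    "slen_GEN_nemo-1.2.6.tar.zst": "e8ccb661e27aa3469b7b943a928282f2",
    "ensl_GEN_nemo-1.2.6.tar.zst": "ea697f3fbc2f8ccb22c594c74b4a1cfe",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each expected file name to True (match), False (mismatch) or None (not found)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of_file(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MD5 MISMATCH", None: "not found"}
    for name, outcome in verify_downloads().items():
        print(f"{name}: {labels[outcome]}")
```

Run from the download directory, the script prints one status line per archive. A mismatch most likely indicates an incomplete or corrupted download.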