ELMo embeddings model, Slovenian

Name: ELMo embeddings model, Slovenian
License: https://opensource.org/licenses/Apache-2.0

Ulčar, Matej

dc.contributor.author	Ulčar, Matej
dc.date.accessioned	2019-10-15T09:10:40Z
dc.date.available	2019-10-15T09:10:40Z
dc.date.issued	2019-10-15
dc.identifier.uri	http://hdl.handle.net/11356/1257
dc.description	ELMo language model (https://github.com/allenai/bilm-tf) used to produce contextual word embeddings, trained on the entire Gigafida 2.0 corpus (https://viri.cjvt.si/gigafida/System/Impressum) for 10 epochs. The 1,364,064 most common tokens were provided as the vocabulary during training. The model can also produce embeddings for out-of-vocabulary (OOV) words, since the neural network input is at the character level.
dc.language.iso	slv
dc.publisher	Faculty of Computer and Information Science, University of Ljubljana
dc.relation	info:eu-repo/grantAgreement/EC/H2020/825153
dc.relation.isreplacedby	http://hdl.handle.net/11356/1277
dc.rights	Apache License 2.0
dc.rights.uri	https://opensource.org/licenses/Apache-2.0
dc.rights.label	PUB
dc.source.uri	http://embeddia.eu
dc.subject	ELMo
dc.subject	contextual embeddings
dc.subject	word embeddings
dc.title	ELMo embeddings model, Slovenian
dc.type	toolService
metashare.ResourceInfo#ContentInfo.detailedType	other
metashare.ResourceInfo#ContentInfo.mediaType	text
has.files	yes
branding	CLARIN.SI data & tools
contact.person	Matej Ulčar matej.ulcar@fri.uni-lj.si Faculty of Computer and Information Science, University of Ljubljana
sponsor	European Union EC/H2020/825153 EMBEDDIA - Cross-Lingual Embeddings for Less-Represented Languages in European News Media euFunds info:eu-repo/grantAgreement/EC/H2020/825153
size.info	213 MB
size.info	1364064 tokens
files.count	2
files.size	223309306
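
The description above states that the model can handle OOV words because the network input is at the character level. The Python sketch below illustrates the general idea only: each word is turned into a fixed-length sequence of ids derived from its UTF-8 bytes, so any word can be encoded whether or not it is in the vocabulary. The function name `encode_word`, the marker ids and the maximum word length are illustrative assumptions, not values read from this item's options.json or taken from the model's actual code.

```python
from typing import List

# Illustrative assumptions, NOT read from this item's options.json.
MAX_WORD_LENGTH = 50   # assumed number of character slots per word
BEGIN_WORD = 258       # assumed id marking the start of a word
END_WORD = 259         # assumed id marking the end of a word
PADDING = 260          # assumed id filling unused slots


def encode_word(word: str, max_len: int = MAX_WORD_LENGTH) -> List[int]:
    """Map a word to a fixed-length list of ids built from its UTF-8 bytes."""
    # Reserve two slots for the begin and end markers, truncating long words.
    raw = word.encode("utf-8")[: max_len - 2]
    ids = [BEGIN_WORD, *raw, END_WORD]
    # Pad on the right so that every word has exactly max_len ids.
    return ids + [PADDING] * (max_len - len(ids))


if __name__ == "__main__":
    # A word gets an encoding regardless of whether it is in the vocabulary.
    print(encode_word("čebelnjak")[:14])
```

Because the encoding depends only on the characters of the word, the 1,364,064-token vocabulary is not needed to build the input for an unseen word.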

Files in this item

This item is publicly available and licensed under the Apache License 2.0.

Name: gigafida_weights.hdf5
Size: 212.96 MB
Format: Unknown
Description: PyTorch weights of the model
MD5: e312065555a9cc3d55b6a4c86138b498

Name: options.json
Size: 546 bytes
Format: Unknown
Description: options file used for training and inference
MD5: c6212def0ff5cf2feb1054b475b7e8c0
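
The MD5 checksums listed for the two files can be used to check that a download is complete and uncorrupted. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `verify_download` are hypothetical, and the sketch assumes both files were saved under their listed names in a single directory.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing of this item.
EXPECTED_MD5 = {
    "gigafida_weights.hdf5": "e312065555a9cc3d55b6a4c86138b498",
    "options.json": "c6212def0ff5cf2feb1054b475b7e8c0",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so the large weights file is never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(directory):
    """Map each expected file name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == expected
    return results


if __name__ == "__main__":
    # Checks the current directory; pass another path if the files live elsewhere.
    for name, ok in verify_download(".").items():
        print(name, "OK" if ok else "missing or checksum mismatch")
```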
