dc.contributor.author | Belej, Primož |
dc.contributor.author | Robnik-Šikonja, Marko |
dc.contributor.author | Krek, Simon |
dc.date.accessioned | 2019-03-06T16:45:42Z |
dc.date.available | 2019-03-06T16:45:42Z |
dc.date.issued | 2019-03-02 |
dc.identifier.uri | http://hdl.handle.net/11356/1211 |
dc.description | Part-of-speech tagger for the Slovene language, implemented using convolutional and LSTM neural networks. The tagger uses a character-level representation of sentences. It was trained on the ssj500k 2.1 corpus, http://hdl.handle.net/11356/1181. |
dc.language.iso | slv |
dc.publisher | Faculty of Computer and Information Science, University of Ljubljana |
dc.publisher | Centre for Language Resources and Technologies, University of Ljubljana |
dc.relation.isreferencedby | https://repozitorij.uni-lj.si/IzpisGradiva.php?id=105266&lang=eng |
dc.rights | GNU General Public Licence, version 3 |
dc.rights.uri | https://opensource.org/licenses/GPL-3.0 |
dc.rights.label | PUB |
dc.source.uri | https://github.com/PrimozBelej/SloTagger |
dc.subject | part-of-speech tagging |
dc.subject | neural networks |
dc.title | Character-level part-of-speech tagger of Slovene language |
dc.type | toolService |
metashare.ResourceInfo#ContentInfo.detailedType | tool |
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent | true |
has.files | yes |
branding | CLARIN.SI data & tools |
contact.person | Primož Belej pbelej90@gmail.com Faculty of Computer and Information Science, University of Ljubljana |
sponsor | University of Ljubljana I0-0022 Network of research and infrastructural centres nationalFunds |
files.count | 14 |
files.size | 147581971 |
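The description states that the tagger works on a character-level representation of sentences. As an illustration only, the following sketch shows one common way such a representation can be built: each known character is mapped to an integer, and sentences are encoded as fixed-length index sequences. The function names, the reserved indices `PAD` and `UNKNOWN`, and the fixed `max_len` are assumptions made for this example; they are not taken from the SloTagger code or from the format of its `characterlist` file.

```python
from typing import Dict, List

PAD = 0      # assumed index reserved for padding (hypothetical choice)
UNKNOWN = 1  # assumed index for characters outside the known list (hypothetical choice)


def build_char_index(characters: str) -> Dict[str, int]:
    """Map each distinct character to an integer, starting after the reserved indices."""
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return {ch: i + 2 for i, ch in enumerate(dict.fromkeys(characters))}


def encode_sentence(sentence: str, char_index: Dict[str, int], max_len: int) -> List[int]:
    """Encode a sentence as a list of exactly max_len character indices."""
    # Truncate long sentences, replace unknown characters, then pad on the right.
    codes = [char_index.get(ch, UNKNOWN) for ch in sentence[:max_len]]
    return codes + [PAD] * (max_len - len(codes))


if __name__ == "__main__":
    index = build_char_index("abcčdefghijklmnoprsštuvzž ")
    print(encode_sentence("to je", index, max_len=8))
```

A sequence model such as a convolutional or LSTM network would typically consume these index sequences through an embedding layer; how the actual tagger does this is described in its README.md and source files.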
Files in this item
- Name
- README.md
- Size
- 2.89 KB
- Format
- Unknown
- Description
- Markdown document containing a brief description and usage examples.
- MD5
- 6ce03d3464c49b5d7069a3b8fcee8661

- Name
- characterlist
- Size
- 327 bytes
- Format
- Unknown
- Description
- List of all characters that the model is aware of. This list is generated during the training process and must be present in the working directory when the tagger is used.
- MD5
- c31ca9602090336659e201b9acd71b71

- Name
- config.py
- Size
- 38 bytes
- Format
- Unknown
- Description
- Configuration file.
- MD5
- 22ea18a7b4467079ae1dc4e010d1ee6d

- Name
- en_sl_tag
- Size
- 27.65 KB
- Format
- Unknown
- Description
- Pairs of English and Slovene tags from the MULTEXT-East specification for the Slovene language.
- MD5
- 13e8cc353a9e5a5b4474b49263dd64b7

- Name
- model.json
- Size
- 14.83 KB
- Format
- Unknown
- Description
- Keras model configuration. This file is generated during the training process.
- MD5
- 7058c188044db19fd1900e4cd667345c

- Name
- neuralmodel.py
- Size
- 4.69 KB
- Format
- Unknown
- Description
- Python library containing implementations of the neural networks used in the language model.
- MD5
- 44e913dc9f264609215926bcbed31a5a

- Name
- pos_embeddings
- Size
- 236.46 KB
- Format
- Unknown
- Description
- Pairs of MULTEXT-East tags and their vector space embeddings.
- MD5
- 7b90f745984463f1df090140a8174af2

- Name
- poslib.py
- Size
- 11.49 KB
- Format
- Unknown
- Description
- Python library containing implementations of POS tag transformations.
- MD5
- 0e8bd56dd114eb825b631035da759118

- Name
- tag.py
- Size
- 8.47 KB
- Format
- Unknown
- Description
- Python module with a command-line interface for tagging text in plain text files and XML/TEI files. Basic examples are available in README.md.
- MD5
- f042274b8804e7811bcbce043b29940b

- Name
- teiutils.py
- Size
- 2.17 KB
- Format
- Unknown
- Description
- Python library implementing helper functions for working with XML files.
- MD5
- 3c86ae3c5b02f16bceba7aacab980a79

- Name
- test.py
- Size
- 1.47 KB
- Format
- Unknown
- Description
- Python module for testing tagging accuracy given predicted sentences and validation sentences. The module has a command-line interface.
- MD5
- 008b9bddfef8529392d122dd92a046b9

- Name
- train.py
- Size
- 2.99 KB
- Format
- Unknown
- Description
- Python module with a command-line interface for training new language models. A basic usage example is in README.md.
- MD5
- 75d3b843f37c26cfed11b3a3303eb4dc

- Name
- txtutils.py
- Size
- 761 bytes
- Format
- Unknown
- Description
- Python library containing helper functions for working with text files.
- MD5
- 6cf6adb0961947a10145cda0373ab832

- Name
- model_weights.h5
- Size
- 140.44 MB
- Format
- Unknown
- Description
- Keras model trained on ssj500k.
- MD5
- 05337dd1a9ccfad2e90d09a3af6b9acc
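
Each file in the listing above carries an MD5 checksum, which can be used to confirm that a download is complete and uncorrupted. The sketch below is one possible way to do this with Python's standard `hashlib` module. The helper names `md5_of_file` and `verify` are made up for this example, the `EXPECTED` table holds only two of the checksums listed above, and the script assumes the files were saved under their listed names in the current directory.

```python
import hashlib
from pathlib import Path
from typing import Dict, Optional

# Two checksums copied from the file listing; extend with the others as needed.
EXPECTED = {
    "README.md": "6ce03d3464c49b5d7069a3b8fcee8661",
    "model_weights.h5": "05337dd1a9ccfad2e90d09a3af6b9acc",
}


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Chunked reading keeps memory use low for the 140 MB weights file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(expected: Dict[str, str], directory: Path) -> Dict[str, Optional[bool]]:
    """Map each file name to True (match), False (mismatch), or None (file missing)."""
    results: Dict[str, Optional[bool]] = {}
    for name, checksum in expected.items():
        path = directory / name
        results[name] = md5_of_file(path) == checksum if path.is_file() else None
    return results


if __name__ == "__main__":
    for name, status in verify(EXPECTED, Path(".")).items():
        label = {True: "OK", False: "MISMATCH", None: "missing"}[status]
        print(f"{name}: {label}")
```

MD5 is adequate here for detecting accidental corruption during transfer; it is not a safeguard against deliberate tampering.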