Show simple item record
dc.contributor.author |
Jakopin, Primož |
dc.date.accessioned |
2022-07-13T07:15:15Z |
dc.date.available |
2022-07-13T07:15:15Z |
dc.date.issued |
2020-06-24 |
dc.identifier.uri |
http://hdl.handle.net/11356/1619 |
dc.description |
The Nova beseda Frequency Lexicon was compiled from the Nova beseda text corpus at the Fran Ramovš Institute of Slovenian Language, with hyphen characters unified and leading and trailing non-breaking spaces deleted.
Unlike most other Slovenian corpora, Nova beseda texts were pre-processed before inclusion. Typos and words with superfluous hyphens, originating from incorrectly joined lines, were corrected, and parts of texts in foreign (non-Slovenian) languages were marked up and excluded from the lexicon.
The corpus contains 318 million tokens, mostly wordforms. It can be searched at http://bos.zrc-sazu.si/a_beseda.html, where wordform search is reached by selecting "word search" in the right-hand "What to do?" column. The same web page also explains the corpus structure.
See also: http://hdl.handle.net/11356/1155 |
dc.language.iso |
slv |
dc.publisher |
ZRC SAZU |
dc.subject |
monolingual lexicon |
dc.subject |
general lexicon |
dc.subject |
modern lexicon |
dc.title |
Nova beseda Frequency Lexicon (ELEXIS) |
dc.type |
lexicalConceptualResource |
metashare.ResourceInfo#ContentInfo.detailedType |
computationalLexicon |
metashare.ResourceInfo#ContentInfo.mediaType |
text |
has.files |
no |
branding |
CLARIN.SI data & tools |
demo.uri |
http://bos.zrc-sazu.si/a_beseda.html |
contact.person |
Mateja Jemec Tomazin, mateja.jemec-tomazin@zrc-sazu.si, ZRC SAZU, Scientific Research Centre of Slovenian Academy of Sciences and Arts |
size.info |
2251151 entries |
files.count |
0 |
files.size |
0 |