Nova beseda Frequency Lexicon (ELEXIS)

Jakopin, Primož

dc.contributor.author	Jakopin, Primož
dc.date.accessioned	2022-07-13T07:15:15Z
dc.date.available	2022-07-13T07:15:15Z
dc.date.issued	2020-06-24
dc.identifier.uri	http://hdl.handle.net/11356/1619
dc.description	The Nova beseda Frequency Lexicon was compiled from the Nova beseda text corpus at the Fran Ramovš Institute of Slovenian Language, with hyphen characters unified and leading and trailing non-breaking spaces deleted. Unlike most other Slovenian corpora, Nova beseda texts were pre-processed before inclusion: typos and words with superfluous hyphens, originating from incorrectly joined lines, were corrected, and passages in foreign (non-Slovenian) languages were marked up and excluded from the lexicon. The corpus contains 318 million tokens, mostly wordforms. It can be searched at http://bos.zrc-sazu.si/a_beseda.html, where wordform search is reached by selecting "word search" in the right-hand "What to do?" column. The same web page also explains the corpus structure. See also: http://hdl.handle.net/11356/1155
dc.language.iso	slv
dc.publisher	ZRC SAZU
dc.subject	monolingual lexicon
dc.subject	general lexicon
dc.subject	modern lexicon
dc.title	Nova beseda Frequency Lexicon (ELEXIS)
dc.type	lexicalConceptualResource
metashare.ResourceInfo#ContentInfo.detailedType	computationalLexicon
metashare.ResourceInfo#ContentInfo.mediaType	text
has.files	no
branding	CLARIN.SI data & tools
demo.uri	http://bos.zrc-sazu.si/a_beseda.html
contact.person	Mateja Jemec Tomazin mateja.jemec-tomazin@zrc-sazu.si ZRC SAZU Scientific Research Centre of Slovenian Academy of Sciences and Arts
size.info	2251151 entries
files.count	0
files.size	0
