Show simple item record

 
dc.contributor.author Krsnik, Luka
dc.contributor.author Arhar Holdt, Špela
dc.contributor.author Čibej, Jaka
dc.contributor.author Dobrovoljc, Kaja
dc.contributor.author Ključevšek, Aleksander
dc.contributor.author Krek, Simon
dc.contributor.author Robnik-Šikonja, Marko
dc.date.accessioned 2019-03-27T12:41:31Z
dc.date.available 2019-03-27T12:41:31Z
dc.date.issued 2019-03-25
dc.identifier.uri http://hdl.handle.net/11356/1227
dc.description The LIST corpus extraction tool is a Java program for extracting lists from text corpora on the levels of characters, word parts, words, and word sets. It supports VERT and TEI P5 XML formats and outputs .CSV files that can be imported into Microsoft Excel or similar statistical processing software.
dc.language.iso slv
dc.language.iso eng
dc.publisher Centre for Language Resources and Technologies, University of Ljubljana
dc.publisher Faculty of Computer and Information Science, University of Ljubljana
dc.publisher Jožef Stefan Institute
dc.relation.isreferencedby http://www.sdjt.si/wp/wp-content/uploads/2018/09/JTDH-2018_Kljucevsek-et-al_Ucinkovit-izracun-frekvencnih-statistik-za-slovenske-jezikovne-korpuse.pdf
dc.relation.isreferencedby https://gitea.cjvt.si/lkrsnik/list
dc.relation.isreplacedby http://hdl.handle.net/11356/1276
dc.rights The MIT License (MIT)
dc.rights.uri https://opensource.org/licenses/mit-license.php
dc.rights.label PUB
dc.source.uri http://slovnica.ijs.si/
dc.subject corpus linguistics
dc.subject text processing
dc.subject extraction
dc.subject characters
dc.subject word parts
dc.subject words
dc.subject word sets
dc.subject n-grams
dc.subject morphology
dc.title Corpus extraction tool LIST 1.0
dc.type toolService
metashare.ResourceInfo#ContentInfo.detailedType tool
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent false
hidden hidden
has.files yes
branding CLARIN.SI data & tools
contact.person Jaka Čibej jaka.cibej@cjvt.si Centre for Language Resources and Technologies, University of Ljubljana
sponsor ARRS (Slovenian Research Agency) J6-8256 New grammar of contemporary standard Slovene: sources and methods nationalFunds
sponsor ARRS (Slovenian Research Agency) P6-0411 Language Resources and Technologies for Slovene nationalFunds
sponsor Jožef Stefan Institute CLARIN CLARIN.SI nationalFunds
files.count 1
files.size 17060024


 Files in this item

This item is Publicly Available and licensed under: The MIT License (MIT)
Name list1.0.zip
Size 16.27 MB
Format application/zip
Description Corpus Extraction Tool LIST 1.0
MD5 7b844582b683a3ad6288857930118caa
 File Preview  
    • list1.0.jar
    • readme.md (3 kB)
    • run.sh
    • run.bat
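
The MD5 value listed for list1.0.zip can be used to check a downloaded copy. The following is a minimal sketch, not part of the LIST distribution: it assumes the archive has been saved as list1.0.zip in the current directory, and the helper names md5_of_file and matches_expected are hypothetical, introduced only for this illustration.

```python
import hashlib
from pathlib import Path

# Checksum and file name as published in this record.
EXPECTED_MD5 = "7b844582b683a3ad6288857930118caa"
# Assumption: the archive was downloaded into the current directory.
ARCHIVE = Path("list1.0.zip")


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Compare the file's MD5 digest with the expected hex string."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    if not ARCHIVE.exists():
        print(f"{ARCHIVE} not found; download it first")
    elif matches_expected(ARCHIVE):
        print("MD5 matches the published checksum")
    else:
        print("MD5 mismatch: the download may be incomplete or altered")
```

An MD5 match only indicates that the file was transferred intact; it is not a guarantee of authenticity.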
