CLARIN.SI repository

Selected Filters
 Subject: tokenisation

Limit your search

Author  
    • Erjavec, Tomaž (17)
    • Dobrovoljc, Kaja (12)
    • Ljubešić, Nikola (10)
    • Arhar Holdt, Špela (9)
    • Čibej, Jaka (8)
    • Batanović, Vuk (7)
    • Samardžić, Tanja (7)
    • Krsnik, Luka (6)
    • Terčon, Luka (6)
    • Zupan, Katja (6)
    • Fišer, Darja (5)
    • Gantar, Polona (4)
    • Holz, Nanika (4)
    • Jezeršek, Lucija (4)
    • Kavčič, Teja (4)
    • Krek, Simon (4)
    • Kuzman, Taja (4)
    • Ledinek, Nina (4)
    • Marko, Dafne (4)
    • Miličević, Maja (4)
Subject  
    • part-of-speech tagging (20)
    • manual annotation (18)
    • TEI (16)
    • named entities (15)
    • parsing (14)
    • lemmatisation (13)
    • computer-mediated communication (9)
    • dependency treebank (9)
    • word normalisation (8)
    • feature prediction (6)
    • language model (6)
    • semantic role labelling (6)
    • sentence segmentation (6)
    • CONLL-U (4)
    • corpus annotation (4)
    • dependency parsing (4)
    • verbal multiword expressions (4)
    • coreference resolution (2)
    • abbreviations (1)
Language (ISO)  
    • Slovenian (16)
    • Croatian (4)
    • Serbian (4)
Type  
    • corpus (18)
    • text (18)
    • toolService (6)

Showing 1 through 10 out of 24 results


  • corpus
    CLARIN.SI data & tools
    CMC training corpus Janes-Tag 3.0
    (Jožef Stefan Institute / 2022-12-06)
    
    Author(s):
    Lenardič, Jakob ; Čibej, Jaka ; Arhar Holdt, Špela ; Erjavec, Tomaž ; Fišer, Darja ; Ljubešić, Nikola ; Zupan, Katja and Dobrovoljc, Kaja
     This item contains 2 files (8.63 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • corpus
    CLARIN.SI data & tools
    CMC training corpus Janes-Norm 3.0
    (Jožef Stefan Institute / 2022-12-06)
    
    Author(s):
    Lenardič, Jakob ; Čibej, Jaka ; Arhar Holdt, Špela ; Erjavec, Tomaž and Fišer, Darja
     This item contains 2 files (12.16 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • corpus
    CLARIN.SI data & tools
    Training corpus SUK 1.1
    (Centre for Language Resources and Technologies, University of Ljubljana / 2024-08-22)
    
    Author(s):
    Arhar Holdt, Špela ; Krek, Simon ; Dobrovoljc, Kaja ; Erjavec, Tomaž ; Gantar, Polona ; Čibej, Jaka ; Pori, Eva ; Terčon, Luka ; Munda, Tina ; Žitnik, Slavko ; Robida, Nejc ; Blagus, Neli ; Može, Sara ; Ledinek, Nina ; Holz, Nanika ; Zupan, Katja ; Kuzman, Taja ; Kavčič, Teja ; Škrjanec, Iza ; Marko, Dafne ; Jezeršek, Lucija and Zajc, Anja
     This item contains 2 files (45.1 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • toolService
    CLARIN.SI data & tools
    The Trankit model for linguistic processing of standard written Slovenian 1.1
    (Centre for Language Resources and Technologies, University of Ljubljana / 2024-08-29)
    
    Author(s):
    Krsnik, Luka ; Dobrovoljc, Kaja and Terčon, Luka
     This item contains 1 file (143.34 MB).
     
    Publicly Available

  • corpus
    CLARIN.SI data & tools
    Croatian linguistic training corpus hr500k 2.0
    (Jožef Stefan Institute / 2023-04-13)
    
    Author(s):
    Ljubešić, Nikola and Samardžić, Tanja
     This item contains 7 files (49.59 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • corpus
    CLARIN.SI data & tools
    Croatian Twitter training corpus ReLDI-NormTagNER-hr 3.0
    (Jožef Stefan Institute / 2023-04-07)
    
    Author(s):
    Ljubešić, Nikola ; Erjavec, Tomaž ; Batanović, Vuk ; Miličević, Maja and Samardžić, Tanja
     This item contains 4 files (8.54 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • corpus
    CLARIN.SI data & tools
    Serbian linguistic training corpus SETimes.SR 2.0
    (Regional Linguistic Data Initiative Centre ReLDI; Jožef Stefan Institute / 2023-06-13)
    
    Author(s):
    Batanović, Vuk ; Ljubešić, Nikola ; Samardžić, Tanja and Erjavec, Tomaž
     This item contains 4 files (9.4 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • toolService
    CLARIN.SI data & tools
    The Trankit model for linguistic processing of written and spoken Slovenian 1.2
    (Centre for Language Resources and Technologies, University of Ljubljana / 2024-12-06)
    
    Author(s):
    Krsnik, Luka ; Dobrovoljc, Kaja and Terčon, Luka
     This item contains 1 file (145.51 MB).
     
    Publicly Available

  • toolService
    CLARIN.SI data & tools
    Trankit model for SST 2.15 1.1
    (Centre for Language Resources and Technologies, University of Ljubljana / 2024-12-06)
    
    Author(s):
    Krsnik, Luka ; Dobrovoljc, Kaja and Terčon, Luka
     This item contains 1 file (138.81 MB).
     
    Publicly Available

  • corpus
    CLARIN.SI data & tools
    Serbian Twitter training corpus ReLDI-NormTagNER-sr 3.0
    (Jožef Stefan Institute / 2023-04-07)
    
    Author(s):
    Ljubešić, Nikola ; Erjavec, Tomaž ; Batanović, Vuk ; Miličević, Maja and Samardžić, Tanja
     This item contains 4 files (8.81 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike


Partners

  • Alpineon, d.o.o.
  • Amebis, d.o.o.
  • Institute of Contemporary History
  • Jožef Stefan Institute
  • National and University Library of Slovenia
  • Slovenian Language Technologies Society
  • University of Ljubljana
  • University of Maribor
  • University of Nova Gorica
  • University of Primorska
  • ZRC SAZU
  • ZRS Koper

This platform runs under the software developed for the LINDAT/CLARIAH-CZ repository for linguistics, available on GitHub

CLARIN.SI is supported by the Ministry of Education, Science and Sport of the Republic of Slovenia under the Programme of "Research Infrastructures".