  • CLARIN.SI repository

 
Selected Filters
 Subject: lemmatisation

Author  
    • Ljubešić, Nikola (23)
    • Erjavec, Tomaž (19)
    • Dobrovoljc, Kaja (14)
    • Terčon, Luka (13)
    • Krsnik, Luka (8)
    • Čibej, Jaka (6)
    • Arhar Holdt, Špela (5)
    • Holozan, Peter (5)
    • Krek, Simon (5)
    • Batanović, Vuk (4)
    • Miličević, Maja (4)
    • Samardžić, Tanja (4)
    • Štefanec, Vanja (4)
    • Robnik-Šikonja, Marko (3)
    • Romih, Miro (3)
    • Derzhanski, Ivan (2)
    • Fišer, Darja (2)
    • Kotsyba, Natalia (2)
    • Lenardič, Jakob (2)
    • Simov, Kiril (2)
Subject  
    • part-of-speech tagging (21)
    • language model (16)
    • tokenisation (13)
    • computer-mediated communication (11)
    • manual annotation (11)
    • TEI (10)
    • named entities (7)
    • feature prediction (6)
    • inflection (6)
    • parsing (6)
    • sentence segmentation (6)
    • word normalisation (6)
    • morphology (5)
    • corpus annotation (4)
    • dependency parsing (4)
    • word embeddings (4)
    • word forms (4)
    • derivation (3)
    • historical language (3)
Rights  
    • PUB (41)
    • ACA (1)
Language (ISO)  
    • Slovenian (23)
    • Serbian (9)
    • Croatian (6)
    • Macedonian (4)
    • Bulgarian (3)
    • Estonian (2)
    • Russian (2)
    • Czech (1)
    • English (1)
    • French (1)
    • Hungarian (1)
    • Persian (1)
    • Polish (1)
    • Romanian (1)
    • Slovak (1)
    • Ukrainian (1)
Type  
    • text (26)
    • toolService (17)
    • corpus (16)
    • lexicalConceptualResource (10)
Contain Files  
    • yes (42)
    • no (1)

Showing 1 through 40 out of 43 results

  • corpus
    CLARIN.SI data & tools
    Ekspress news article archive (in Estonian and Russian) 1.0
    (Ekspress Meedia Group / 2021-04-19)
    
    Author(s):
    Purver, Matthew ; Pollak, Senja ; Freienthal, Linda ; Kuulmets, Hele-Andra ; Krustok, Ivar and Shekhar, Ravi
     This item contains 6 files (2.32 GB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Noncommercial No Derivative Works

  • lexicalConceptualResource
    CLARIN.SI data & tools
    MULTEXT-East non-commercial lexicons 4.0
    (Jožef Stefan Institute / 2010-05-14)
    
    Author(s):
    Erjavec, Tomaž ; Derzhanski, Ivan ; Divjak, Dagmar ; Feldman, Anna ; Kopotev, Mikhail ; Kotsyba, Natalia ; Krstev, Cvetana ; Petrovski, Aleksandar ; QasemiZadeh, Behrang ; Radziszewski, Adam ; Sharoff, Serge ; Sokolovsky, Paul ; Vitas, Duško and Zdravkova, Katerina
     This item contains 6 files (12.05 MB).
     
    Academic Use Attribution Required Noncommercial

  • lexicalConceptualResource
    CLARIN.SI data & tools
    Morphological lexicon Sloleks 3.0
    (Centre for Language Resources and Technologies, University of Ljubljana / 2022-12-05)
    
    Author(s):
    Čibej, Jaka ; Gantar, Kaja ; Dobrovoljc, Kaja ; Krek, Simon ; Holozan, Peter ; Erjavec, Tomaž ; Romih, Miro ; Arhar Holdt, Špela ; Krsnik, Luka and Robnik-Šikonja, Marko
     This item contains 1 file (239.75 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • toolService
    CLARIN.SI data & tools
    The CLASSLA-Stanza model for lemmatisation of standard Slovenian 2.0
    (Jožef Stefan Institute / 2023-01-31)
    
    Author(s):
    Terčon, Luka ; Čibej, Jaka and Ljubešić, Nikola
     This item contains 1 file (2.09 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • toolService
    CLARIN.SI data & tools
    The CLASSLA-StanfordNLP model for lemmatisation of non-standard Serbian 1.1
    (Jožef Stefan Institute / 2020-09-15)
    
    Author(s):
    Ljubešić, Nikola and Štefanec, Vanja
     This item contains 1 file (90.05 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • toolService
    CLARIN.SI data & tools
    The CLASSLA-StanfordNLP model for lemmatisation of non-standard Croatian 1.1
    (Jožef Stefan Institute / 2020-07-17)
    
    Author(s):
    Ljubešić, Nikola and Štefanec, Vanja
     This item contains 1 file (89.98 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • lexicalConceptualResource
    CLARIN.SI data & tools
    Beseda Corpus Lemmatisation Lexicon
    (ZRC SAZU / 2017)
    
    Author(s):
    Jakopin, Primož
     This item contains 1 file (10.56 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required

  • corpus
    CLARIN.SI data & tools
    Annotated Corpus of Pre-Standardized Balkan Slavic Literature 1.1
    (Slavic Seminary, University of Zurich / 2021-07-02)
    
    Author(s):
    Šimko, Ivan
     This item contains 5 files (3.58 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • toolService
    CLARIN.SI data & tools
    The CLASSLA-Stanza model for lemmatisation of non-standard Slovenian 2.1
    (Jožef Stefan Institute / 2023-03-30)
    
    Author(s):
    Terčon, Luka and Ljubešić, Nikola
     This item contains 1 file (2.35 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • lexicalConceptualResource
    CLARIN.SI data & tools
    Word embeddings CLARIN.SI-embed.sr 1.0
    (Jožef Stefan Institute / 2018-12-10)
    
    Author(s):
    Ljubešić, Nikola
     This item contains 4 files (3.36 GB).
     
    Publicly Available Distributed under Creative Commons Attribution Required

  • lexicalConceptualResource
    CLARIN.SI data & tools
    Word embeddings CLARIN.SI-embed.sl 1.0
    (Jožef Stefan Institute / 2018-11-26)
    
    Author(s):
    Ljubešić, Nikola and Erjavec, Tomaž
     This item contains 4 files (6.41 GB).
     
    Publicly Available Distributed under Creative Commons Attribution Required

  • lexicalConceptualResource
    CLARIN.SI data & tools
    Word embeddings CLARIN.SI-embed.hr 1.0
    (Jožef Stefan Institute / 2018-12-10)
    
    Author(s):
    Ljubešić, Nikola
     This item contains 4 files (4.88 GB).
     
    Publicly Available Distributed under Creative Commons Attribution Required

  • corpus
    CLARIN.SI data & tools
    Annotated sample of the Slovenian Biographical Lexicon SBL-51abbr 1.0
    (Slovenian Academy of Sciences and Arts / 2022-06-15)
    
    Author(s):
    Erjavec, Tomaž ; Vide Ogrin, Petra ; Lenardič, Jakob ; Mlinar Strgar, Mojca and Frankl, Simona
     This item contains 3 files (1.09 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required

  • toolService
    CLARIN.SI data & tools
    The CLASSLA-Stanza model for lemmatisation of standard Croatian 2.1
    (Jožef Stefan Institute / 2023-05-10)
    
    Author(s):
    Terčon, Luka and Ljubešić, Nikola
     This item contains 1 file (98.13 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • toolService
    CLARIN.SI data & tools
    The CLASSLA-Stanza model for lemmatisation of non-standard Serbian 2.1
    (Jožef Stefan Institute / 2023-05-10)
    
    Author(s):
    Terčon, Luka ; Ljubešić, Nikola and Štefanec, Vanja
     This item contains 1 file (104.93 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • toolService
    CLARIN.SI data & tools
    The CLASSLA-Stanza model for lemmatisation of standard Serbian 2.1
    (Jožef Stefan Institute / 2023-05-10)
    
    Author(s):
    Terčon, Luka and Ljubešić, Nikola
     This item contains 1 file (104.93 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • toolService
    CLARIN.SI data & tools
    The CLASSLA-Stanza model for lemmatisation of non-standard Croatian 2.1
    (Jožef Stefan Institute / 2023-05-10)
    
    Author(s):
    Terčon, Luka ; Ljubešić, Nikola and Štefanec, Vanja
     This item contains 1 file (98.12 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • lexicalConceptualResource
    CLARIN.SI data & tools
    MULTEXT-East free lexicons 4.0
    (Jožef Stefan Institute / 2010-05-14)
    
    Author(s):
    Erjavec, Tomaž ; Bruda, Ştefan ; Derzhanski, Ivan ; Dimitrova, Ludmila ; Garabík, Radovan ; Holozan, Peter ; Ide, Nancy ; Kaalep, Heiki-Jaan ; Kotsyba, Natalia ; Oravecz, Csaba ; Petkevič, Vladimír ; Priest-Dorman, Greg ; Shevchenko, Igor ; Simov, Kiril ; Sinapova, Lydia ; Steenwijk, Han ; Tihanyi, Laszlo ; Tufiş, Dan and Véronis, Jean
     This item contains 12 files (16.27 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • lexicalConceptualResource
    CLARIN.SI data & tools
    Lexicon of historical Slovene imp25k 1.1
    (Jožef Stefan Institute / 2014-09-13)
    
    Author(s):
    Erjavec, Tomaž
     This item contains 2 files (25.42 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required

  • toolService
    CLARIN.SI data & tools
    The CLASSLA-Stanza model for lemmatisation of standard Macedonian 2.1
    (Jožef Stefan Institute / 2023-06-27)
    
    Author(s):
    Terčon, Luka ; Ljubešić, Nikola ; Zdravkova, Katerina and Erjavec, Tomaž
     This item contains 1 file (2.19 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • toolService
    CLARIN.SI data & tools
    The CLASSLA-Stanza model for lemmatisation of standard Bulgarian 2.1
    (Jožef Stefan Institute; IICT-BAS / 2023-06-27)
    
    Author(s):
    Terčon, Luka ; Ljubešić, Nikola ; Osenova, Petya and Simov, Kiril
     This item contains 1 file (52.95 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • corpus
    CLARIN.SI data & tools
    Macedonian linguistic training corpus SETimes.MK 0.1
    (Jožef Stefan Institute / 2023-12-20)
    
    Author(s):
    Ljubešić, Nikola and Stojanovska, Biljana
     This item contains 1 file (1.1 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • corpus
    CLARIN.SI data & tools
    Serbian web corpus srWaC 1.1
    (Jožef Stefan Institute / 2016-05-12)
    
    Author(s):
    Ljubešić, Nikola and Klubička, Filip
     This item contains 6 files (3.51 GB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • corpus
    CLARIN.SI data & tools
    CMC training corpus Janes-Tag 2.1
    (Jožef Stefan Institute / 2019-09-11)
    
    Author(s):
    Erjavec, Tomaž ; Fišer, Darja ; Čibej, Jaka ; Arhar Holdt, Špela ; Ljubešić, Nikola ; Zupan, Katja and Dobrovoljc, Kaja
     This item contains 7 files (5.68 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • corpus
    CLARIN.SI data & tools
    Training corpus jos1M 1.2
    (Jožef Stefan Institute / 2019-02-13)
    
    Author(s):
    Erjavec, Tomaž ; Krek, Simon and Dobrovoljc, Kaja
     This item contains 4 files (108.6 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Noncommercial

  • corpus
    CLARIN.SI data & tools
    Corpus of Written Standard Slovene Gigafida 2.0
    (Centre for Language Resources and Technologies, University of Ljubljana / 2019-06-13)
    
    Author(s):
    Krek, Simon ; Erjavec, Tomaž ; Repar, Andraž ; Čibej, Jaka ; Arhar Holdt, Špela ; Gantar, Polona ; Kosem, Iztok ; Robnik-Šikonja, Marko ; Ljubešić, Nikola ; Dobrovoljc, Kaja ; Laskowski, Cyprian ; Grčar, Miha ; Holozan, Peter ; Šuster, Simon ; Gorjanc, Vojko ; Stabej, Marko and Logar, Nataša
     This item contains no files.

  • corpus
    CLARIN.SI data & tools
    CMC training corpus Janes-Tag 3.0
    (Jožef Stefan Institute / 2022-12-06)
    
    Author(s):
    Lenardič, Jakob ; Čibej, Jaka ; Arhar Holdt, Špela ; Erjavec, Tomaž ; Fišer, Darja ; Ljubešić, Nikola ; Zupan, Katja and Dobrovoljc, Kaja
     This item contains 2 files (8.63 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • corpus
    CLARIN.SI data & tools
    Spoken Torlak dialect corpus 1.0 (transcription)
    (Slavisches Seminar, University of Zurich / 2020-09-01)
    
    Author(s):
    Vuković, Teodora
     This item contains 4 files (19.34 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Noncommercial

  • toolService
    CLARIN.SI data & tools
    The Trankit model for linguistic processing of standard written Slovenian 1.1
    (Centre for Language Resources and Technologies, University of Ljubljana / 2024-08-29)
    
    Author(s):
    Krsnik, Luka ; Dobrovoljc, Kaja and Terčon, Luka
     This item contains 1 file (143.34 MB).
     
    Publicly Available

  • toolService
    CLARIN.SI data & tools
    Trankit model for linguistic processing of spoken Slovenian
    (Centre for Language Resources and Technologies, University of Ljubljana / 2024-01-17)
    
    Author(s):
    Krsnik, Luka and Dobrovoljc, Kaja
     This item contains 1 file (145.13 MB).
     
    Publicly Available

  • toolService
    CLARIN.SI data & tools
    The Trankit model for linguistic processing of standard Slovenian
    (Centre for Language Resources and Technologies, University of Ljubljana / 2023-09-29)
    
    Author(s):
    Krsnik, Luka and Dobrovoljc, Kaja
     This item contains 1 file (142.95 MB).
     
    Publicly Available

  • corpus
    CLARIN.SI data & tools
    Serbian Twitter training corpus ReLDI-NormTagNER-sr 2.1
    (Jožef Stefan Institute / 2019-07-28)
    
    Author(s):
    Ljubešić, Nikola ; Erjavec, Tomaž ; Batanović, Vuk ; Miličević, Maja and Samardžić, Tanja
     This item contains 4 files (4.51 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • corpus
    CLARIN.SI data & tools
    Croatian Twitter training corpus ReLDI-NormTagNER-hr 3.0
    (Jožef Stefan Institute / 2023-04-07)
    
    Author(s):
    Ljubešić, Nikola ; Erjavec, Tomaž ; Batanović, Vuk ; Miličević, Maja and Samardžić, Tanja
     This item contains 4 files (8.54 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • corpus
    CLARIN.SI data & tools
    Croatian Twitter training corpus ReLDI-NormTagNER-hr 2.1
    (Jožef Stefan Institute / 2019-09-11)
    
    Author(s):
    Ljubešić, Nikola ; Erjavec, Tomaž ; Batanović, Vuk ; Miličević, Maja and Samardžić, Tanja
     This item contains 4 files (4.56 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • toolService
    CLARIN.SI data & tools
    The Trankit model for linguistic processing of written and spoken Slovenian 1.2
    (Centre for Language Resources and Technologies, University of Ljubljana / 2024-12-06)
    
    Author(s):
    Krsnik, Luka ; Dobrovoljc, Kaja and Terčon, Luka
     This item contains 1 file (145.51 MB).
     
    Publicly Available

  • toolService
    CLARIN.SI data & tools
    Trankit model for SST 2.15 1.1
    (Centre for Language Resources and Technologies, University of Ljubljana / 2024-12-06)
    
    Author(s):
    Krsnik, Luka ; Dobrovoljc, Kaja and Terčon, Luka
     This item contains 1 file (138.81 MB).
     
    Publicly Available

  • toolService
    CLARIN.SI data & tools
    The Trankit model for linguistic processing of spoken and written Slovenian 1.1
    (Centre for Language Resources and Technologies, University of Ljubljana / 2024-09-02)
    
    Author(s):
    Krsnik, Luka ; Dobrovoljc, Kaja and Terčon, Luka
     This item contains 1 file (145.44 MB).
     
    Publicly Available

  • toolService
    CLARIN.SI data & tools
    The CLASSLA-Stanza model for lemmatisation of spoken Slovenian 2.2
    (Jožef Stefan Institute / 2025-02-07)
    
    Author(s):
    Terčon, Luka ; Dobrovoljc, Kaja and Ljubešić, Nikola
     This item contains 1 file (2.09 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • lexicalConceptualResource
    CLARIN.SI data & tools
    Morphological lexicon Sloleks 1.2
    (Centre for Language Resources and Technologies, University of Ljubljana / 2015-06-14)
    
    Author(s):
    Dobrovoljc, Kaja ; Krek, Simon ; Holozan, Peter ; Erjavec, Tomaž and Romih, Miro
     This item contains 5 files (79.77 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Noncommercial Share Alike

  • corpus
    CLARIN.SI data & tools
    Digital library and corpus of historical Slovene IMP 1.1
    (Jožef Stefan Institute / 2014-07-28)
    
    Author(s):
    Erjavec, Tomaž
     This item contains 4 files (338.05 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike


Partners

  • Alpineon, d.o.o.
  • Amebis, d.o.o.
  • Institute of Contemporary History
  • Jožef Stefan Institute
  • National and University Library of Slovenia
  • Slovenian Language Technologies Society

  • University of Ljubljana
  • University of Maribor
  • University of Nova Gorica
  • University of Primorska
  • ZRC SAZU
  • ZRS Koper

This platform runs under the software developed for the LINDAT/CLARIAH-CZ repository for linguistics, available on GitHub

CLARIN.SI is supported by the Ministry of Education, Science and Sport of the Republic of Slovenia under the Programme of "Research Infrastructures".