  • CLARIN.SI repository
 
Advanced Search

Filters

Use filters to refine the search results.

Limit your search

Author  
    • Ljubešić, Nikola (94)
    • Rupnik, Peter (27)
    • Erjavec, Tomaž (25)
    • Kuzman, Taja (19)
    • Batanović, Vuk (15)
    • Esplà-Gomis, Miquel (13)
    • Toral, Antonio (13)
    • Terčon, Luka (12)
    • Bañón, Marta (10)
    • Forcada, Mikel L. (10)
    • García-Romero, Cristian (10)
    • Pla Sempere, Leopoldo (10)
    • Ramírez-Sánchez, Gema (10)
    • Suchomel, Vít (10)
    • van Noord, Rik (10)
    • Fišer, Darja (9)
    • Samardžić, Tanja (9)
    • Stanković, Ranka (9)
    • Bogetić, Ksenija (8)
    • Chichirau, Malina (8)
Subject  
    • language model (23)
    • web corpus (23)
    • part-of-speech tagging (20)
    • multilingual (19)
    • computer-mediated communication (17)
    • TEI (16)
    • lemmatisation (15)
    • parallel corpus (14)
    • manual annotation (13)
    • lexicographic resource (11)
    • parliamentary debates (11)
    • news corpus (10)
    • modern dictionary (9)
    • named entities (8)
    • news comments (8)
    • parsing (8)
    • tokenisation (8)
    • Croatian Parliament (7)
    • Czech Parliament (7)
    • monolingual dictionary (7)
Rights  
    • PUB (125)
    • ACA (7)
Language (ISO)  
    • Croatian (87)
    • Serbian (78)
    • English (40)
    • Slovenian (35)
    • Bosnian (26)
    • Bulgarian (15)
    • Estonian (14)
    • Hungarian (14)
    • Latvian (14)
    • Polish (14)
    • German (13)
    • Russian (13)
    • Spanish (13)
    • Czech (12)
    • Dutch (12)
    • Swedish (12)
    • Finnish (11)
    • Montenegrin (11)
    • Catalan (10)
    • Danish (10)
Type  
    • text (125)
    • corpus (89)
    • lexicalConceptualResource (39)
    • toolService (25)
    • audio (4)
Contains Files  
    • yes (132)
    • no (21)

Showing 1 through 10 out of 153 results

  • corpus
    CLARIN.SI data & tools
    Text collection for training the BERTić transformer model BERTić-data
    (Jožef Stefan Institute / 2021-05-05)
    
    Author(s):
    Ljubešić, Nikola
     This item contains 10 files (21.14 GB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • corpus
    CLARIN.SI data & tools
    The Twitter user dataset for discriminating between Bosnian, Croatian, Montenegrin and Serbian Twitter-HBS 1.0
    (Jožef Stefan Institute / 2022-01-26)
    
    Author(s):
    Ljubešić, Nikola and Rupnik, Peter
     This item contains 1 file (12.98 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • lexicalConceptualResource
    CLARIN.SI data & tools
    Database of the Western South Slavic Verb HyperVerb 2.0 -- WeSoSlav
    (University of Graz; University of Nova Gorica / 2024-12-10)
    
    Author(s):
    Arsenijević, Boban ; Gomboc Čeh, Katarina ; Marušič, Franc Lanko ; Milosavljević, Stefan ; Mišmaš, Petra ; Simić, Jelena ; Simonović, Marko and Žaucer, Rok
     This item contains 3 files (11.43 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required

  • corpus
    CLARIN.SI data & tools
    Comparable corpora of South-Slavic Wikipedias CLASSLA-Wikipedia 1.0
    (Jožef Stefan Institute / 2021-05-05)
    
    Author(s):
    Ljubešić, Nikola ; Markoski, Filip ; Markoska, Elena and Erjavec, Tomaž
     This item contains 7 files (5.04 GB).
     
    Publicly Available Distributed under Creative Commons Attribution Required

  • corpus
    CLARIN.SI data & tools
    The news dataset for discriminating between Bosnian, Croatian and Serbian SETimes.HBS 1.0
    (Jožef Stefan Institute / 2022-01-26)
    
    Author(s):
    Ljubešić, Nikola and Rupnik, Peter
     This item contains 1 file (20.15 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • corpus
    CLARIN.SI data & tools
    The sentiment corpus of parliamentary debates ParlaSent-BCS v1.0
    (Jožef Stefan Institute / 2022-06-08)
    
    Author(s):
    Mochtak, Michal ; Rupnik, Peter and Ljubešić, Nikola
     This item contains 1 file (1.13 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • corpus
    CLARIN.SI data & tools
    Corpus of Bosnia and Herzegovina language-related news comments MetaLangNEWS-COMMENTS-Bs
    (ZRC SAZU; Regional Linguistic Data Initiative Centre ReLDI / 2022-09-30)
    
    Author(s):
    Bogetić, Ksenija ; Milinković, Michael and Batanović, Vuk
     This item contains 2 files (1.88 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Noncommercial Share Alike

  • corpus
    CLARIN.SI data & tools
    Corpus of Bosnia and Herzegovina language-related news articles MetaLangNEWS-Bs
    (ZRC SAZU; Regional Linguistic Data Initiative Centre ReLDI / 2022-09-30)
    
    Author(s):
    Bogetić, Ksenija ; Milinković, Michael and Batanović, Vuk
     This item contains 2 files (2.03 MB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Noncommercial Share Alike

  • corpus
    CLARIN.SI data & tools
    Heritage Bosnian, Croatian, and Serbian spoken by Second Generation Speakers in Germany He-BCS-Ge
    (University of Regensburg; University of Zurich / 2024-11-10)
    
    Author(s):
    Romić, Daniel
     This item contains 1 file (109.1 KB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Noncommercial

  • corpus
    CLARIN.SI data & tools
    Genre-enriched web corpora MaCoCu-Genre
    (Jožef Stefan Institute / 2024-10-07)
    
    Author(s):
    Kuzman, Taja and Ljubešić, Nikola
     This item contains 14 files (101.43 GB).
     
    Publicly Available

Partners

  • Alpineon, d.o.o.
  • Amebis, d.o.o.
  • Institute of Contemporary History
  • Jožef Stefan Institute
  • National and University Library of Slovenia
  • Slovenian Language Technologies Society

  • University of Ljubljana
  • University of Maribor
  • University of Nova Gorica
  • University of Primorska
  • ZRC SAZU
  • ZRS Koper

This platform runs under the software developed for the LINDAT/CLARIAH-CZ repository for linguistics, available on GitHub

CLARIN.SI is supported by the Ministry of Education, Science and Sport of the Republic of Slovenia
under the Programme of "Research Infrastructures".