CLARIN.SI repository

Selected Filters
Subject: web corpus

Limit your search

Author  
    • Ljubešić, Nikola (63)
    • Kuzman, Taja (53)
    • Rupnik, Peter (52)
    • Toral, Antonio (49)
    • Esplà-Gomis, Miquel (48)
    • Bañón, Marta (44)
    • Forcada, Mikel L. (44)
    • García-Romero, Cristian (44)
    • Pla Sempere, Leopoldo (44)
    • Ramírez-Sánchez, Gema (44)
    • Suchomel, Vít (44)
    • van Noord, Rik (44)
    • Chichirau, Malina (28)
    • Galiano-Jiménez, Aarón (28)
    • Zaragoza-Bernabeu, Jaume (28)
    • van der Werff, Tobias (16)
    • Zaragoza, Jaume (16)
    • Klubička, Filip (7)
    • Ortiz Rojas, Sergio (4)
    • Runić, Marija (2)
Subject  
    • multilingual (27)
    • parallel corpus (27)
    • automatic genre identification (9)
    • genre corpus (8)
    • DSI (2)
    • genre (1)
    • genre classification (1)
    • language model (1)
    • lemmatisation (1)
    • manual annotation (1)
    • news discourse (1)
Rights  
    • PUB (60)
    • ACA (4)
Language (ISO)  
    • English (28)
    • Croatian (9)
    • Serbian (8)
    • Slovenian (8)
    • Bosnian (6)
    • Bulgarian (6)
    • Macedonian (6)
    • Montenegrin (6)
    • Icelandic (5)
    • Turkish (5)
    • Maltese (4)
    • Albanian (3)
    • Catalan (3)
    • Modern Greek (1453-) (3)
    • Ukrainian (3)
    • Finnish (2)
    • Dutch (1)
    • Spanish (1)

Showing 1 through 10 out of 64 results


  • corpus
    CLARIN.SI data & tools
    Ukrainian-English parallel corpus MaCoCu-uk-en 1.0
    (Jožef Stefan Institute; Prompsit; Rijksuniversiteit Groningen; Universitat d'Alacant / 2023-07-07)
    
    Author(s):
    Bañón, Marta ; Chichirau, Malina ; Esplà-Gomis, Miquel ; Forcada, Mikel L. ; Galiano-Jiménez, Aarón ; García-Romero, Cristian ; Kuzman, Taja ; Ljubešić, Nikola ; van Noord, Rik ; Pla Sempere, Leopoldo ; Ramírez-Sánchez, Gema ; Rupnik, Peter ; Suchomel, Vít ; Toral, Antonio ; Zaragoza-Bernabeu, Jaume
     This item contains 3 files (8.18 GB).
     
    Publicly Available

  • corpus
    CLARIN.SI data & tools
    Ukrainian web corpus MaCoCu-uk 1.0
    (Jožef Stefan Institute; Prompsit; Rijksuniversiteit Groningen; Universitat d'Alacant / 2023-05-24)
    
    Author(s):
    Bañón, Marta ; Chichirau, Malina ; Esplà-Gomis, Miquel ; Forcada, Mikel L. ; Galiano-Jiménez, Aarón ; García-Romero, Cristian ; Kuzman, Taja ; Ljubešić, Nikola ; van Noord, Rik ; Pla Sempere, Leopoldo ; Ramírez-Sánchez, Gema ; Rupnik, Peter ; Suchomel, Vít ; Toral, Antonio ; Zaragoza-Bernabeu, Jaume
     This item contains 2 files (24.58 GB).
     
    Publicly Available

  • corpus
    CLARIN.SI data & tools
    Turkish-English parallel corpus MaCoCu-tr-en 2.0
    (Jožef Stefan Institute; Prompsit; Rijksuniversiteit Groningen; Universitat d'Alacant / 2023-04-26)
    
    Author(s):
    Bañón, Marta ; Chichirau, Malina ; Esplà-Gomis, Miquel ; Forcada, Mikel L. ; Galiano-Jiménez, Aarón ; García-Romero, Cristian ; Kuzman, Taja ; Ljubešić, Nikola ; van Noord, Rik ; Pla Sempere, Leopoldo ; Ramírez-Sánchez, Gema ; Rupnik, Peter ; Suchomel, Vít ; Toral, Antonio ; Zaragoza-Bernabeu, Jaume
     This item contains 3 files (3.03 GB).
     
    Publicly Available

  • corpus
    CLARIN.SI data & tools
    Turkish-English parallel corpus MaCoCu-tr-en 1.0
    (Jožef Stefan Institute; Prompsit; Rijksuniversiteit Groningen; Universitat d'Alacant / 2022-04-25)
    
    Author(s):
    Bañón, Marta ; Esplà-Gomis, Miquel ; Forcada, Mikel L. ; García-Romero, Cristian ; Kuzman, Taja ; Ljubešić, Nikola ; van Noord, Rik ; Pla Sempere, Leopoldo ; Ramírez-Sánchez, Gema ; Rupnik, Peter ; Suchomel, Vít ; Toral, Antonio ; van der Werff, Tobias ; Zaragoza, Jaume
     This item contains 2 files (4.56 GB).
     
    Publicly Available

  • corpus
    CLARIN.SI data & tools
    Turkish web corpus MaCoCu-tr 2.0
    (Jožef Stefan Institute; Prompsit; Rijksuniversiteit Groningen; Universitat d'Alacant / 2023-04-20)
    
    Author(s):
    Bañón, Marta ; Chichirau, Malina ; Esplà-Gomis, Miquel ; Forcada, Mikel L. ; Galiano-Jiménez, Aarón ; García-Romero, Cristian ; Kuzman, Taja ; Ljubešić, Nikola ; van Noord, Rik ; Pla Sempere, Leopoldo ; Ramírez-Sánchez, Gema ; Rupnik, Peter ; Suchomel, Vít ; Toral, Antonio ; Zaragoza-Bernabeu, Jaume
     This item contains 2 files (15.07 GB).
     
    Publicly Available

  • corpus
    CLARIN.SI data & tools
    Turkish web corpus MaCoCu-tr 1.0
    (Jožef Stefan Institute; Prompsit; Rijksuniversiteit Groningen; Universitat d'Alacant / 2022-04-29)
    
    Author(s):
    Bañón, Marta ; Esplà-Gomis, Miquel ; Forcada, Mikel L. ; García-Romero, Cristian ; Kuzman, Taja ; Ljubešić, Nikola ; van Noord, Rik ; Pla Sempere, Leopoldo ; Ramírez-Sánchez, Gema ; Rupnik, Peter ; Suchomel, Vít ; Toral, Antonio ; van der Werff, Tobias ; Zaragoza, Jaume
     This item contains 3 files (31.42 GB).
     
    Publicly Available

  • corpus
    CLARIN.SI data & tools
    Text collection for training the BERTić transformer model BERTić-data
    (Jožef Stefan Institute / 2021-05-05)
    
    Author(s):
    Ljubešić, Nikola
     This item contains 10 files (21.14 GB).
     
    Publicly Available Distributed under Creative Commons Attribution Required Share Alike

  • corpus
    CLARIN.SI data & tools
    Slovenian web corpus CLASSLA-web.sl 1.0
    (Jožef Stefan Institute / 2024-03-22)
    
    Author(s):
    Ljubešić, Nikola ; Rupnik, Peter and Kuzman, Taja
     This item contains 2 files (16.36 GB).
     
    Publicly Available

  • corpus
    CLARIN.SI data & tools
    Slovene-English parallel corpus slenWaC 1.0
    (Jožef Stefan Institute / 2016-03-10)
    
    Author(s):
    Ljubešić, Nikola ; Esplà-Gomis, Miquel ; Ortiz Rojas, Sergio ; Klubička, Filip and Toral, Antonio
     This item contains 1 file (94.44 MB).
     
    Academic Use Attribution Required Noncommercial

  • corpus
    CLARIN.SI data & tools
    Slovene-English parallel corpus MaCoCu-sl-en 2.0
    (Jožef Stefan Institute; Prompsit; Rijksuniversiteit Groningen; Universitat d'Alacant / 2023-04-26)
    
    Author(s):
    Bañón, Marta ; Chichirau, Malina ; Esplà-Gomis, Miquel ; Forcada, Mikel L. ; Galiano-Jiménez, Aarón ; García-Romero, Cristian ; Kuzman, Taja ; Ljubešić, Nikola ; van Noord, Rik ; Pla Sempere, Leopoldo ; Ramírez-Sánchez, Gema ; Rupnik, Peter ; Suchomel, Vít ; Toral, Antonio ; Zaragoza-Bernabeu, Jaume
     This item contains 3 files (1.96 GB).
     
    Publicly Available


Partners

  • Alpineon, d.o.o.
  • Amebis, d.o.o.
  • Institute of Contemporary History
  • Jožef Stefan Institute
  • National and University Library of Slovenia
  • Slovenian Language Technologies Society

  • University of Ljubljana
  • University of Maribor
  • University of Nova Gorica
  • University of Primorska
  • ZRC SAZU
  • ZRS Koper


This platform runs on software developed for the LINDAT/CLARIAH-CZ repository for linguistics, available on GitHub.

CLARIN.SI is supported by the Ministry of Education, Science and Sport of the Republic of Slovenia under the Programme of "Research Infrastructures".