What's New

 corpus 
corpus
Description:
The Trendi corpus is a monitor corpus of Slovenian. It contains news articles from 106 media websites, published by 62 publishers. Trendi 2026-05 covers the period from January 2019 to Maj 2026, complementing the Gigafida ...
 This item contains no files.
 corpus 
corpus
Author(s):
Description:
This entry contains the first part of the e-book "Bobrček Tonči" (Tony the Beaver) by author Maja Pakiž (COBISS ID: 278141699, ISBN: 978-961-7249-58-3). Tony, the little beaver from Forest Village, loves nothing more ...
 This item contains 1 file (11.94 MB).
 
Publicly Available Distributed under Creative Commons Attribution Required Share Alike
 corpus 
corpus
Description:
GaMS-Instruct-SAFE is a an instruction-following safety dataset designed to fine-tune Slovene large language models to provide safe responses (i.e. to train them to refuse responding to prompts that could lead to physical, ...
 This item contains 1 file (150.54 KB).
 
Publicly Available Distributed under Creative Commons Attribution Required Share Alike

Most Viewed Items

Top Last Week
 corpus 
corpus
Author(s):
Erjavec, Tomaž ; et al.show everyone Erjavec, Tomaž ; Kopp, Matyáš ; Kuzman Pungeršek, Taja ; Ljubešić, Nikola ; Ogrodniczuk, Maciej ; Osenova, Petya ; Agirrezabal, Manex ; Agnoloni, Tommaso ; Aires, José ; Albini, Monica ; Alkorta, Jon ; Antiba-Cartazo, Iván ; Arrieta, Ekain ; Barcala, Mario ; Bardanca, Daniel ; Barkarson, Starkaður ; Bartolini, Roberto ; Battistoni, Roberto ; Bel, Nuria ; Bonet Ramos, Maria del Mar ; Calzada Pérez, María ; Cardoso, Aida ; Çöltekin, Çağrı ; Coole, Matthew ; Darģis, Roberts ; de Libano, Ruben ; Depoorter, Griet ; Diwersy, Sascha ; Dodé, Réka ; Fernandez, Kike ; Fernández Rei, Elisa ; Frontini, Francesca ; Garcia, Marcos ; García Díaz, Noelia ; García Louzao, Pedro ; Gavriilidou, Maria ; Gkoumas, Dimitris ; Grigorov, Ilko ; Grigorova, Vladislava ; Haltrup Hansen, Dorte ; Iruskieta, Mikel ; Jarlbrink, Johan ; Jelencsik-Mátyus, Kinga ; Jongejan, Bart ; Kahusk, Neeme ; Kirnbauer, Martin ; Kryvenko, Anna ; Ligeti-Nagy, Noémi ; Luxardo, Giancarlo ; Magariños, Carmen ; Magnusson, Måns ; Marchetti, Carlo ; Marx, Maarten ; Meden, Katja ; Mendes, Amália ; Mochtak, Michal ; Mölder, Martin ; Montemagni, Simonetta ; Navarretta, Costanza ; Nitoń, Bartłomiej ; Norén, Fredrik Mohammadi ; Nwadukwe, Amanda ; Ojsteršek, Mihael ; Pančur, Andrej ; Papavassiliou, Vassilis ; Pereira, Rui ; Pérez Lago, María ; Piperidis, Stelios ; Pirker, Hannes ; Pisani, Marilina ; Pol, Henk van der ; Prokopidis, Prokopis ; Quochi, Valeria ; Rayson, Paul ; Regueira, Xosé Luís ; Rii, Andriana ; Rudolf, Michał ; Ruisi, Manuela ; Rupnik, Peter ; Schopper, Daniel ; Simov, Kiril ; Sinikallio, Laura ; Skubic, Jure ; Tungland, Lars Magne ; Tuominen, Jouni ; van Heusden, Ruben ; Varga, Zsófia ; Vázquez Abuín, Marta ; Venturi, Giulia ; Vidal Miguéns, Adrián ; Vider, Kadri ; Vivel Couso, Ainhoa ; Vladu, Adina Ioana ; Wissik, Tanja ; Yrjänäinen, Väinö ; Zevallos, Rodolfo ; Fišer, Darja
Description:
ParlaMint 5.0 is a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions, mostly starting in 2015 and extending to mid-2022. The individual corpora ...
 This item contains 31 files (5.94 GB).
 
Publicly Available Distributed under Creative Commons Attribution Required
 corpus 
corpus
Author(s):
Description:
goo300k is a manually annotated reference corpus of historical Slovene. It contains 1,100 pages (about 300,000 tokens) sampled from 89 texts from the period 1584-1899. Each text contains extensive meta-data and per-page ...
 This item contains 2 files (8.9 MB).
 
Publicly Available Distributed under Creative Commons Attribution Required
 corpus 
corpus
Author(s):
Kuzman Pungeršek, Taja ; et al.show everyone Kuzman Pungeršek, Taja ; Ljubešić, Nikola ; Erjavec, Tomaž ; Kopp, Matyáš ; Ogrodniczuk, Maciej ; Osenova, Petya ; Rayson, Paul ; Vidler, John ; Agerri, Rodrigo ; Agirrezabal, Manex ; Agnoloni, Tommaso ; Aires, José ; Albini, Monica ; Alkorta, Jon ; Antiba-Cartazo, Iván ; Arrieta, Ekain ; Barcala, Mario ; Bardanca, Daniel ; Barkarson, Starkaður ; Bartolini, Roberto ; Battistoni, Roberto ; Bel, Nuria ; Bonet Ramos, Maria del Mar ; Calzada Pérez, María ; Cardoso, Aida ; Çöltekin, Çağrı ; Coole, Matthew ; Darģis, Roberts ; de Does, Jesse ; de Libano, Ruben ; Depoorter, Griet ; Depuydt, Katrien ; Diwersy, Sascha ; Dodé, Réka ; Fernandez, Kike ; Fernández Rei, Elisa ; Frontini, Francesca ; Garcia, Marcos ; García Díaz, Noelia ; García Louzao, Pedro ; Gavriilidou, Maria ; Gkoumas, Dimitris ; Grigorov, Ilko ; Grigorova, Vladislava ; Haltrup Hansen, Dorte ; Iruskieta, Mikel ; Jarlbrink, Johan ; Jelencsik-Mátyus, Kinga ; Jongejan, Bart ; Kahusk, Neeme ; Kirnbauer, Martin ; Kryvenko, Anna ; Ligeti-Nagy, Noémi ; Luxardo, Giancarlo ; Magariños, Carmen ; Magnusson, Måns ; Marchetti, Carlo ; Marx, Maarten ; Meden, Katja ; Mendes, Amália ; Mochtak, Michal ; Mölder, Martin ; Montemagni, Simonetta ; Navarretta, Costanza ; Nitoń, Bartłomiej ; Norén, Fredrik Mohammadi ; Nwadukwe, Amanda ; Ojsteršek, Mihael ; Pančur, Andrej ; Papavassiliou, Vassilis ; Pereira, Rui ; Pérez Lago, María ; Piperidis, Stelios ; Pirker, Hannes ; Pisani, Marilina ; Pol, Henk van der ; Prokopidis, Prokopis ; Quochi, Valeria ; Regueira, Xosé Luís ; Rii, Andriana ; Rudolf, Michał ; Ruisi, Manuela ; Rupnik, Peter ; Schopper, Daniel ; Simov, Kiril ; Sinikallio, Laura ; Skubic, Jure ; Tamper, Minna ; Tungland, Lars Magne ; Tuominen, Jouni ; van Heusden, Ruben ; Varga, Zsófia ; Vázquez Abuín, Marta ; Venturi, Giulia ; Vidal Miguéns, Adrián ; Vider, Kadri ; Vivel Couso, Ainhoa ; Vladu, Adina Ioana ; Wissik, Tanja ; Yrjänäinen, Väinö ; Zevallos, Rodolfo ; Fišer, Darja
Description:
ParlaMint-en.ana 5.0 is the English machine translation of the ParlaMint.ana 5.0 (http://hdl.handle.net/11356/2005) set of corpora of parliamentary debates across Europe. The translation keeps the structure and metadata ...
 This item contains 31 files (57.16 GB).
 
Publicly Available Distributed under Creative Commons Attribution Required