dc.contributor.author Tovornik, Robert
dc.contributor.author Pavlović, Anđela
dc.contributor.author Plesnik, Emil
dc.contributor.author Fabjan, Borut
dc.date.accessioned 2024-11-06T16:17:02Z
dc.date.available 2024-11-06T16:17:02Z
dc.date.issued 2024-09-25
dc.identifier.uri http://hdl.handle.net/11356/1982
dc.description GaMS-Instruct-MED is an instruction-following dataset designed for fine-tuning Slovene large language models to follow instructions in the medical domain. It consists of prompt-response pairs from the field of medicine, particularly those pertaining to the use of pharmaceutical drugs and medications. The dataset was generated in several steps. First, in consultation with experts from the medical field, a series of prompts was manually compiled containing questions of interest in the context of drug and medication use. Then, for each medication in the PoVeJMo-VeMo-Med 1.0 dataset (http://hdl.handle.net/11356/1983), approximately 10-15 questions were automatically generated using prompt tuning. The questions were based on the instructions for use of the medication in question. Inadequate questions were manually excluded, while the responses were generated entirely automatically using a specialized RAG system. Please note that the current version of the dataset (containing 18,897 prompt-response pairs) does not guarantee clinical accuracy and may contain errors resulting from LLM hallucinations.
dc.language.iso slv
dc.publisher Better, d.o.o.
dc.publisher Faculty of Computer and Information Science, University of Ljubljana
dc.rights Creative Commons - Attribution 4.0 International (CC BY 4.0)
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.rights.label PUB
dc.source.uri https://www.cjvt.si/povejmo/en/project/
dc.subject instruction following dataset
dc.subject medical texts
dc.subject large language models
dc.title Slovene instruction-following dataset for large language models GaMS-Instruct-MED 1.0
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding CLARIN.SI data & tools
contact.person Borut Fabjan info@better.care Better, d.o.o.
sponsor ARIS (Slovenian Research and Innovation Agency) NOO PoVeJMo research project (Adaptive Natural Language Processing with Large Language Models) nationalFunds
sponsor ARRS (Slovenian Research Agency) P6-0411 Language Resources and Technologies for Slovene nationalFunds
size.info 18897 units
files.count 1
files.size 4801364
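The record states that the archive holds 18,897 prompt-response pairs in JSON, but it does not document the internal layout of the file. The following is a minimal sketch for inspecting a downloaded copy, assuming (this is not confirmed by the record) that the archive contains one or more `.json` members whose top level is a list of records. The function name `count_records` is hypothetical, and no field names are assumed.

```python
import json
import zipfile

EXPECTED_UNITS = 18897  # size.info value stated in this record


def count_records(zip_path):
    """Map each .json member of the archive to its number of top-level items.

    Assumption: each JSON member is a top-level list of records. Members with
    any other top-level type are reported as None so the mismatch is visible.
    """
    counts = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".json"):
                continue  # skip READMEs, licence files, directories, etc.
            with archive.open(name) as member:
                data = json.load(member)  # json.load accepts UTF-8 bytes
            counts[name] = len(data) if isinstance(data, list) else None
    return counts
```

If the assumption holds, the counts returned for `GaMS-Instruct-MED_1.0.zip` can be compared against `EXPECTED_UNITS`; a `None` entry would indicate that the file is structured differently and needs to be inspected by hand.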


 Files in this item

This item is Publicly Available and licensed under:
Creative Commons - Attribution 4.0 International (CC BY 4.0)
Name GaMS-Instruct-MED_1.0.zip
Size 4.58 MB
Format application/zip
Description GaMS-Instruct-MED 1.0 (JSON)
MD5 750933f232205b55d22654a6f46513b9
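The MD5 checksum and byte size listed in this record can be used to confirm that a download is complete and unmodified. The sketch below shows one way to do this with the Python standard library. The helper names `md5_of_file` and `verify_download` are hypothetical, and the local path passed to them is whatever location the file was saved to.

```python
import hashlib
import os

EXPECTED_MD5 = "750933f232205b55d22654a6f46513b9"  # MD5 stated in this record
EXPECTED_SIZE = 4801364  # files.size in bytes, stated in this record


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True only if both the byte size and the MD5 digest match."""
    if os.path.getsize(path) != expected_size:
        return False  # cheap check first; avoids hashing a truncated file
    return md5_of_file(path) == expected_md5
```

For example, `verify_download("GaMS-Instruct-MED_1.0.zip")` checks a copy saved in the current directory. MD5 is adequate for detecting accidental corruption, which is what the repository publishes it for; it is not a defence against deliberate tampering.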