Face-domain-specific automatic speech recognition models

Name: Face-domain-specific automatic speech recognition models
License: https://opensource.org/licenses/Apache-2.0

Dobrišek, Simon; Križaj, Janez; Ivanovska, Marija; Grm, Klemen

dc.contributor.author	Dobrišek, Simon
dc.contributor.author	Križaj, Janez
dc.contributor.author	Ivanovska, Marija
dc.contributor.author	Grm, Klemen
dc.date.accessioned	2023-03-11T12:56:34Z
dc.date.available	2023-03-11T12:56:34Z
dc.date.issued	2023-02-24
dc.identifier.uri	http://hdl.handle.net/11356/1749
dc.description	This entry contains all the files required to build face-domain-specific automatic speech recognition (ASR) applications with the Kaldi ASR toolkit (https://github.com/kaldi-asr/kaldi): the acoustic model, the language model, other necessary files such as HCLG.fst, and the decoding scripts and configuration files needed to use these models. The acoustic model was trained with the Kaldi ASR tools on the Artur speech corpus (http://hdl.handle.net/11356/1776; http://hdl.handle.net/11356/1772). The language model was trained on domain-specific text consisting of face descriptions, obtained by translating the English Face2Text dataset (https://github.com/mtanti/face2text-dataset) into Slovenian. The "local/test_recognition.sh" script extracts the acoustic speech features and runs the speech recognition tests. Two speech corpora ("test" and "obrazi") and two Kaldi ASR models ("graph_splosni" and "graph_obrazi") are available for these tests and are selected through the "test_sets" and "graph" variables in that script. The "results.sh" script reports the speech recognition test results. Before running the scripts, set the KALDI_ROOT environment variable in "path.sh" to the Kaldi ASR toolkit installation folder.
dc.language.iso	slv
dc.publisher	Faculty of Electrical Engineering, University of Ljubljana
dc.relation.isreferencedby	https://github.com/clarinsi/rsdo_fdsasr_v2
dc.rights	Apache License 2.0
dc.rights.uri	https://opensource.org/licenses/Apache-2.0
dc.rights.label	PUB
dc.source.uri	https://rsdo.slovenscina.eu/govorne-tehnologije
dc.subject	automatic speech recognition
dc.subject	acoustic model
dc.subject	language model
dc.subject	Kaldi ASR toolkit
dc.title	Face-domain-specific automatic speech recognition models
dc.type	toolService
metashare.ResourceInfo#ContentInfo.detailedType	tool
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent	true
has.files	yes
branding	CLARIN.SI data & tools
contact.person	Simon Dobrišek simon.dobrisek@fe.uni-lj.si Faculty of Electrical Engineering, University of Ljubljana
sponsor	Ministry of Culture C3340-20-278001 Development of Slovene in a Digital Environment Other
files.count	1
files.size	11991262806
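
The configuration steps named in the description (KALDI_ROOT in "path.sh", "graph" and "test_sets" in "local/test_recognition.sh") could be scripted roughly as follows. This is only a sketch: set_var is a hypothetical helper that is not part of the archive, it assumes each variable is assigned on its own line as name=value (optionally prefixed with "export"), and the Kaldi path shown in the usage comment is a placeholder. Check the actual scripts before relying on it.

```shell
#!/usr/bin/env bash
# set_var FILE NAME VALUE
# Hypothetical helper: rewrites a line of the form "NAME=..." or
# "export NAME=..." in FILE so that it assigns VALUE instead.
# Assumes VALUE contains no "|" or "&" characters (they are special to sed).
set_var() {
    local file="$1" name="$2" value="$3" tmp
    tmp="$(mktemp)" || return 1
    # Write the edited copy to a temp file, then copy it back so the
    # original file keeps its permissions (e.g. the executable bit).
    if sed "s|^\(export \)\{0,1\}${name}=.*|\1${name}=${value}|" "$file" > "$tmp"; then
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}

# Usage, assumed to be run from the root of the extracted archive:
#   set_var path.sh KALDI_ROOT /path/to/kaldi          # placeholder path
#   set_var local/test_recognition.sh graph graph_obrazi
#   set_var local/test_recognition.sh test_sets obrazi
#   bash local/test_recognition.sh
#   bash results.sh
```

Editing the three variables by hand in a text editor works just as well; the helper is only convenient when switching repeatedly between the "graph_splosni" and "graph_obrazi" models.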

Files in this item

This item is Publicly Available and licensed under: Apache License 2.0

Name: FDSASR-Kaldi-Models.tgz
Size: 11.17 GB
Format: Unknown
Description: Gzipped tar of the models
MD5: 439dd2a8f5a2851feafaef60c63c9909
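
Because the archive is large, it may be worth checking the download against the MD5 value listed above before unpacking it. The following is a minimal sketch, assuming GNU coreutils md5sum is available; verify_md5 is a hypothetical helper name, not something shipped with the models.

```shell
#!/usr/bin/env bash
# verify_md5 FILE EXPECTED_MD5
# Hypothetical helper: returns 0 if FILE's MD5 hash equals EXPECTED_MD5,
# 1 on a mismatch, and 2 if FILE does not exist.
verify_md5() {
    local file="$1" expected="$2" actual
    if [ ! -f "$file" ]; then
        echo "Missing file: $file" >&2
        return 2
    fi
    # md5sum prints "<hash>  <filename>"; keep only the hash field.
    actual="$(md5sum "$file" | awk '{print $1}')"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        return 1
    fi
}

# Usage, after downloading the archive into the current directory:
#   verify_md5 FDSASR-Kaldi-Models.tgz 439dd2a8f5a2851feafaef60c63c9909 \
#       && tar -xzf FDSASR-Kaldi-Models.tgz
```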
