2026-07-17T16:36:51Z http://www.clarin.si/repository/oai/request

oai:www.clarin.si:11356/1749 2023-03-27T17:01:16Z hdl_11356_1023 hdl_11356_1024

Face-domain-specific automatic speech recognition models
Dobrišek, Simon; Križaj, Janez; Ivanovska, Marija; Grm, Klemen
automatic speech recognition; acoustic model; language model; Kaldi ASR toolkit

This entry contains all the files required to implement face-domain-specific automatic speech recognition (ASR) applications using the Kaldi ASR toolkit (https://github.com/kaldi-asr/kaldi), including the acoustic model, the language model, and other relevant files, together with the scripts and configuration files needed to use these models.

The acoustic model was trained using the relevant Kaldi ASR tools and the Artur speech corpus (http://hdl.handle.net/11356/1776; http://hdl.handle.net/11356/1772). The language model was trained on domain-specific text data consisting of face descriptions obtained by translating the English Face2Text dataset (https://github.com/mtanti/face2text-dataset) into Slovenian. These models, combined with other necessary files such as HCLG.fst and the decoding scripts, enable the implementation of face-domain-specific ASR applications.

Two speech corpora ("test" and "obrazi") and two Kaldi ASR models ("graph_splosni" and "graph_obrazi") can be selected for speech recognition tests by setting the variables "graph" and "test_sets" in the "local/test_recognition.sh" script. The same script extracts the acoustic speech features and runs the recognition tests. The test results can be obtained using the "results.sh" script. The KALDI_ROOT environment variable must also be set in the "path.sh" script to point to the Kaldi ASR toolkit installation folder.
2023-02-24
toolService
http://hdl.handle.net/11356/1749
slv
https://github.com/clarinsi/rsdo_fdsasr_v2
Apache License 2.0
https://opensource.org/licenses/Apache-2.0
PUB
text/plain; charset=utf-8
application/octet-stream
downloadable_files_count: 1
Faculty of Electrical Engineering, University of Ljubljana
https://rsdo.slovenscina.eu/govorne-tehnologije
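
The settings named in the description can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of the distributed scripts: the variable names "graph", "test_sets" and KALDI_ROOT and their allowed values come from the description, while the placeholder installation path, the space-separated list syntax, and the validation logic are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real path.sh and local/test_recognition.sh
# in the repository may define these variables differently.

# path.sh: point KALDI_ROOT at the Kaldi installation folder.
# "/opt/kaldi" is a placeholder path, not taken from the entry.
export KALDI_ROOT="${KALDI_ROOT:-/opt/kaldi}"

# local/test_recognition.sh: select one model and the corpora to test.
graph="graph_obrazi"       # alternative: "graph_splosni"
test_sets="test obrazi"    # space-separated list is an assumption

# Check the selection against the names given in the description.
case "$graph" in
  graph_splosni|graph_obrazi) ;;
  *) echo "unknown graph: $graph" >&2; exit 1 ;;
esac
for set_name in $test_sets; do
  case "$set_name" in
    test|obrazi) ;;
    *) echo "unknown test set: $set_name" >&2; exit 1 ;;
  esac
done

echo "KALDI_ROOT=$KALDI_ROOT graph=$graph test_sets=$test_sets"
```

With the variables set, the description indicates that "local/test_recognition.sh" is run first and "results.sh" afterwards; the working directory and any arguments they expect are not stated in this entry.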