ASR training dataset for Croatian ParlaSpeech-HR v1.0
http://hdl.handle.net/11356/1494

The ParlaSpeech-HR.v1.0.jsonl file (JSON Lines format, one entry per line) consists of entries with the following attributes:

path: name of the file with the segment recording
orig_file: name of the original file harvested from YouTube
start: time (in seconds) at which the segment starts in the original file
end: time (in seconds) at which the segment ends in the original file
words: list of words from the original transcript
word_start_times: relative start time (in seconds) of each word in words
norm_words: list of words normalized with an imperfect rule-based normaliser
norm_words_start_times: relative start time (in seconds) of each word in norm_words
utterance_id_start: ID of the utterance in the ParlaMint 2.1 corpus (http://hdl.handle.net/11356/1432) where the segment starts
utterance_id_end: ID of the utterance in the ParlaMint 2.1 corpus where the segment ends
speaker_info: list of speaker attributes from ParlaMint 2.1 if the segment has a single speaker (null otherwise)
split: one of "train", "dev", or "test", or "null" if the segment has multiple speakers
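
The attributes above can be read with a few lines of Python. The following is a minimal sketch, not part of the dataset distribution: the helper names (read_entries, count_splits, segment_duration, word_timings) are hypothetical, and it assumes that words and word_start_times are parallel lists of equal length, as the attribute descriptions suggest.

```python
import json
from collections import Counter


def read_entries(path):
    """Yield one dict per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_splits(entries):
    """Count entries per value of the "split" attribute."""
    return Counter(entry.get("split") for entry in entries)


def segment_duration(entry):
    """Segment length in seconds, computed from the start and end attributes."""
    return entry["end"] - entry["start"]


def word_timings(entry, normalized=False):
    """Pair each word with its start time.

    Assumption: the word list and the start-time list are parallel.
    """
    if normalized:
        return list(zip(entry["norm_words"], entry["norm_words_start_times"]))
    return list(zip(entry["words"], entry["word_start_times"]))


# Example usage (assumes the file is in the current directory):
# entries = list(read_entries("ParlaSpeech-HR.v1.0.jsonl"))
# print(count_splits(entries))
# train = [e for e in entries if e.get("split") == "train"]
```

The reader streams the file line by line, so the whole dataset does not have to be held in memory unless the caller collects the entries into a list.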