dc.contributor.author | Jelovšek, Tjaša |
dc.contributor.author | Lebar Bajec, Iztok |
dc.contributor.author | Bajec, Marko |
dc.contributor.author | Bajec, Žan |
dc.contributor.author | Cvek, Jernej |
dc.date.accessioned | 2022-12-02T10:50:11Z |
dc.date.available | 2022-12-02T10:50:11Z |
dc.date.issued | 2022-12-01 |
dc.identifier.uri | http://hdl.handle.net/11356/1743 |
dc.description | This text denormalizer converts Slovene spoken-form text into written-form text. It is typically used as a post-processing step in automatic speech recognition (ASR), which traditionally outputs spoken-form text. The input can be a string, a list of tokens, or a list of dictionaries with a mandatory "text" field. The output is a dictionary with the keys 'denormalized_content' and 'denormalized_string'. Example of use: denormalize("Danes, osmega sedmega dva tisoč dvaindvajset, je lep sončen dan, saj je zunaj prijetnih petindvajset stopinj Celzija.") returns {'denormalized_content': [{'text': 'Danes', 'index': [0]}, {'text': ',', 'index': [1]}, {'text': '8.', 'index': [2]}, {'text': '7.', 'index': [3]}, {'text': '2022', 'index': [4, 5, 6]}, {'text': ',', 'index': [7]}, {'text': 'je', 'index': [8]}, {'text': 'lep', 'index': [9]}, {'text': 'sončen', 'index': [10]}, {'text': 'dan', 'index': [11]}, {'text': ',', 'index': [12]}, {'text': 'saj', 'index': [13]}, {'text': 'je', 'index': [14]}, {'text': 'zunaj', 'index': [15]}, {'text': 'prijetnih', 'index': [16]}, {'text': '25', 'index': [17]}, {'text': '°C', 'index': [18, 19]}, {'text': '.', 'index': [20]}], 'denormalized_string': 'Danes, 8. 7. 2022, je lep sončen dan, saj je zunaj prijetnih 25 °C.'} |
dc.language.iso | slv |
dc.publisher | Faculty of Computer and Information Science, University of Ljubljana |
dc.relation.isreferencedby | https://rsdo.slovenscina.eu/en/speech-technologies |
dc.rights | Apache License 2.0 |
dc.rights.uri | https://opensource.org/licenses/Apache-2.0 |
dc.rights.label | PUB |
dc.source.uri | https://github.com/clarinsi/Slovene_denormalizator |
dc.subject | text denormalisation |
dc.title | Slovene Text Denormalizator RSDO-DS2-DENORM 1.0 |
dc.type | toolService |
metashare.ResourceInfo#ContentInfo.detailedType | tool |
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent | true |
has.files | yes |
branding | CLARIN.SI data & tools |
contact.person | Iztok Lebar Bajec ilb@fri.uni-lj.si Faculty of Computer and Information Science, University of Ljubljana |
sponsor | Ministry of Culture C3340-20-278001 Development of Slovene in a Digital Environment Other |
files.count | 1 |
files.size | 9349120 |
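
The output structure shown in the description can be read as an alignment: each written-form token carries an `index` list pointing at the spoken-form input tokens it replaces. The sketch below illustrates this on the first part of that example. It does not call the tool itself; the `result` dictionary is copied (abridged) from the description, while the `spoken_tokens` list and the helper names `align` and `merged_tokens` are assumptions made for illustration and are not part of the published package.

```python
# Abridged copy of the example output from the description (not produced by running the tool).
result = {
    "denormalized_content": [
        {"text": "Danes", "index": [0]},
        {"text": ",", "index": [1]},
        {"text": "8.", "index": [2]},
        {"text": "7.", "index": [3]},
        {"text": "2022", "index": [4, 5, 6]},
        {"text": ",", "index": [7]},
    ],
}

# ASSUMPTION: tokenisation of the spoken-form input, inferred from the index values above.
spoken_tokens = ["Danes", ",", "osmega", "sedmega", "dva", "tisoč", "dvaindvajset", ","]


def align(result, spoken_tokens):
    """Pair each written-form token with the spoken-form tokens it replaces."""
    pairs = []
    for item in result["denormalized_content"]:
        source = [spoken_tokens[i] for i in item["index"]]
        pairs.append((item["text"], source))
    return pairs


def merged_tokens(result):
    """Return written-form tokens that replace more than one input token."""
    return [
        item["text"]
        for item in result["denormalized_content"]
        if len(item["index"]) > 1
    ]


alignment = align(result, spoken_tokens)
for written, spoken in alignment:
    print(f"{written!r} <- {spoken}")
```

An alignment of this kind can be useful when word-level timestamps from a speech recogniser need to be carried over to the written-form tokens.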
Files in this item

- Name: Slovene_denormalizator-1.0.tar
- Size: 8.92 MB
- Format: Unknown
- Description: RSDO DS2 DENORM 1.0
- MD5: 53a4894927e9b29d02c30efa729286b2
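
The MD5 checksum and byte size listed in this record can be used to check a downloaded copy of the archive. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `verify_download` are hypothetical, and the local path assumes the archive was saved under its listed file name in the current directory.

```python
import hashlib
from pathlib import Path

# Values taken from this record (MD5 field and files.size).
EXPECTED_MD5 = "53a4894927e9b29d02c30efa729286b2"
EXPECTED_SIZE = 9349120  # bytes


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True if the file has the expected size and MD5 checksum."""
    path = Path(path)
    if path.stat().st_size != expected_size:
        return False
    return md5_of_file(path) == expected_md5


if __name__ == "__main__":
    # ASSUMPTION: the archive sits in the current working directory.
    archive = Path("Slovene_denormalizator-1.0.tar")
    if archive.exists():
        print("checksum OK" if verify_download(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

MD5 is adequate here for detecting a corrupted or truncated download; it is not a guarantee against deliberate tampering.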