Files in this item
This item is publicly available and licensed under: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)





- Name: README.txt
- Size: 1.53 KB
- Format: Text file
- Description: Description of the format.
- MD5: feaf50df36f2595ced352c0d45469ba8
The Vejica corpus is stored as a tab-delimited text file with two columns: an ID and one sentence. The sentences are UTF-8 plain text, with "÷" (U+00F7) in place of each superfluous comma and "¤" (U+00A4) marking each missing comma. The IDs encode the source of the sampled sentence and start as follows:
KUST.de. - corpus KUST, first language German
KUST.en. - corpus KUST, first language English
KUST.es. - corpus KUST, first language Spanish
KUST.it. - corpus KUST, first language Italian
KUST.sh. - corpus KUST, first language Croatian, Serbian or Bosnian
Solar.G1. - corpus Šolar, grammar school, 1st grade
Solar.G2. - corpus Šolar, grammar school, 2nd grade
Solar.G3. - corpus Šolar, grammar school, 3rd grade
Solar.G4. - corpus Šolar, grammar school, 4th grade
Solar.OS6. - corpus Šolar, primary school, 6th grade
Solar.OS7. - corpus Šolar, primary school, 7th grade
Solar.OS8. - corpus Šolar, primary school, 8th grade
Solar.OS9. - corpus Šolar, primary school, 9th grade
Sola . . .
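The format described in the README excerpt above could be read roughly as follows. This is a minimal Python sketch, not code shipped with the corpus. The function names are hypothetical, and two things are assumptions: that "÷" stands exactly where the writer put the superfluous comma, and that "¤" stands exactly where the missing comma belongs. The sample ID and sentence in the usage note are made up for illustration, since the README preview does not show what follows the ID prefix.

```python
SUPERFLUOUS = "\u00f7"  # "÷": the writer put a comma here that should not be there
MISSING = "\u00a4"      # "¤": the writer omitted a comma that belongs here


def parse_line(line):
    """Split one tab-delimited line into (sentence_id, sentence)."""
    sentence_id, sentence = line.rstrip("\n").split("\t", 1)
    return sentence_id, sentence


def as_written(sentence):
    """Assumed reconstruction of the sentence as the writer punctuated it."""
    return sentence.replace(SUPERFLUOUS, ",").replace(MISSING, "")


def as_corrected(sentence):
    """Assumed reconstruction of the sentence with comma errors fixed."""
    return sentence.replace(SUPERFLUOUS, "").replace(MISSING, ",")


def source_of(sentence_id):
    """Return (corpus, group) from an ID prefix such as 'KUST.de.' or 'Solar.G1.'."""
    corpus, _, rest = sentence_id.partition(".")
    group = rest.split(".", 1)[0]
    return corpus, group
```

For example, with a made-up line `"KUST.de.1\tA÷ b¤ c."`, `parse_line` separates the ID from the sentence, `as_written` turns the sentence into `"A, b c."`, and `as_corrected` turns it into `"A b, c."`. When reading the real file, open it with `encoding="utf-8"` so both marker characters decode correctly.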

- Name: vejica10.zip
- Size: 4.01 MB
- Format: application/zip
- Description: Tab delimited text file with two columns: ID and sentence.
- MD5: db6b1a854660fe1f142f80e56e10e250
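
The MD5 values listed for each file can be used to check a download for corruption. The following is a small Python sketch using only the standard library. The function names and the local file paths are assumptions, and the expected digests are copied from the listing above.

```python
import hashlib

# Digests as listed on this page for the two files.
EXPECTED_MD5 = {
    "README.txt": "feaf50df36f2595ced352c0d45469ba8",
    "vejica10.zip": "db6b1a854660fe1f142f80e56e10e250",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, name):
    """Return True if the file at `path` has the digest listed for `name`."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

Assuming the archive was saved in the current directory, `matches_listing("vejica10.zip", "vejica10.zip")` should return `True` for an intact download. MD5 is adequate for detecting accidental corruption but is not a safeguard against deliberate tampering.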