Files in this item

This item is publicly available and licensed under:
Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Name: konvNormSl.zip
Size: 4.57 MB
Format: application/zip
Description: Dataset
MD5: 98a809350431cce453224e842a413212
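
The MD5 value above can be used to check that a download completed without corruption. The following is a minimal sketch, assuming the archive was saved locally under its listed name, konvNormSl.zip; the function names are illustrative and not part of the dataset.

```python
import hashlib

# MD5 checksum as listed for konvNormSl.zip on this page.
EXPECTED_MD5 = "98a809350431cce453224e842a413212"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path="konvNormSl.zip"):
    """Return True if the file at `path` matches the listed checksum.

    The default path is an assumption about where the download was saved.
    """
    return md5_of_file(path) == EXPECTED_MD5
```

Calling `verify_archive()` in the download directory returns `True` only when the local file matches the listed checksum. MD5 is suitable here for detecting transfer errors, not for security guarantees.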
 File Preview  
  • konvNormSl
    • README.txt (1 kB)
    • token
      • dev
        • goo300k-gaj.token.dev.norm.txt (302 kB)
        • tweet-L3.token.dev.norm.txt (57 kB)
        • tweet-L1.token.dev.orig.txt (57 kB)
        • goo300k-gaj.token.dev.orig.txt (303 kB)
        • tweet-L3.token.dev.orig.txt (56 kB)
        • goo300k-bohoric.token.dev.norm.txt (82 kB)
        • tweet-L1.token.dev.norm.txt (57 kB)
        • goo300k-bohoric.token.dev.orig.txt (85 kB)
      • train
        • goo300k-bohoric.token.train.orig.txt (733 kB)
        • tweet-L1.token.train.orig.txt (452 kB)
        • goo300k-gaj.token.train.norm.txt (2 MB)
        • tweet-L3.token.train.norm.txt (484 kB)
        • goo300k-gaj.token.train.orig.txt (2 MB)
        • tweet-L3.token.train.orig.txt (471 kB)
        • goo300k-bohoric.token.train.norm.txt (705 kB)
        • tweet-L1.token.train.norm.txt (454 kB)
      • test
        • tweet-L3.token.test.orig.txt (58 kB)
        • goo300k-gaj.token.test.norm.txt (314 kB)
        • goo300k-gaj.token.test.orig.txt (314 kB)
        • tweet-L1.token.test.norm.txt (58 kB)
        • goo300k-bohoric.token.test.norm.txt (85 kB)
        • tweet-L3.token.test.norm.txt (60 kB)
        • tweet-L1.token.test.orig.txt (58 kB)
        • goo300k-bohoric.token.test.orig.txt (88 kB)
    • segment
      • dev
        • goo300k-gaj.segment.dev.norm.txt (255 kB)
        • tweet-L3.segment.dev.norm.txt (48 kB)
        • goo300k-bohoric.segment.dev.norm.txt (69 kB)
        • goo300k-gaj.segment.dev.orig.txt (256 kB)
        • tweet-L3.segment.dev.orig.txt (47 kB)
        • goo300k-bohoric.segment.dev.orig.txt (72 kB)
        • tweet-L1.segment.dev.norm.txt (48 kB)
        • tweet-L1.segment.dev.orig.txt (48 kB)
      • train
        • goo300k-gaj.segment.train.norm.txt (1 MB)
        • goo300k-bohoric.segment.train.norm.txt (593 kB)
        • tweet-L3.segment.train.orig.txt (394 kB)
        • tweet-L1.segment.train.orig.txt (385 kB)
        • goo300k-gaj.segment.train.orig.txt (1 MB)
        • goo300k-bohoric.segment.train.orig.txt (621 kB)
        • tweet-L3.segment.train.norm.txt (407 kB)
        • tweet-L1.segment.train.norm.txt (386 kB)
      • test
        • tweet-L3.segment.test.orig.txt (48 kB)
        • goo300k-bohoric.segment.test.orig.txt (74 kB)
        • tweet-L1.segment.test.norm.txt (49 kB)
        • tweet-L1.segment.test.orig.txt (49 kB)
        • goo300k-gaj.segment.test.norm.txt (264 kB)
        • tweet-L3.segment.test.norm.txt (49 kB)
        • goo300k-bohoric.segment.test.norm.txt (71 kB)
        • goo300k-gaj.segment.test.orig.txt (264 kB)
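
The file names in the listing follow a visible pattern: a dataset name, a level (token or segment), a split (train, dev or test), and an orig or norm suffix. The sketch below pairs each orig file with its norm counterpart based on that pattern alone. Reading "orig" as the original text and "norm" as its normalised form is an assumption drawn from the names; README.txt in the archive is the authoritative description. The function names are hypothetical helpers, and nothing here assumes anything about the contents of the files.

```python
import re
from collections import defaultdict

# Pattern inferred from the listed file names, e.g.
# "goo300k-gaj.token.dev.norm.txt" (an assumption, not a documented spec).
NAME_RE = re.compile(
    r"^(?P<dataset>.+)\.(?P<level>token|segment)"
    r"\.(?P<split>train|dev|test)\.(?P<side>orig|norm)\.txt$"
)


def parse_name(filename):
    """Split a file name into dataset, level, split and side, or return None."""
    match = NAME_RE.match(filename)
    return match.groupdict() if match else None


def pair_files(filenames):
    """Map (dataset, level, split) to an (orig, norm) pair of file names."""
    sides = defaultdict(dict)
    for name in filenames:
        info = parse_name(name)
        if info is None:
            continue  # skip files outside the pattern, such as README.txt
        key = (info["dataset"], info["level"], info["split"])
        sides[key][info["side"]] = name
    # Keep only keys for which both sides are present.
    return {
        key: (found["orig"], found["norm"])
        for key, found in sides.items()
        if "orig" in found and "norm" in found
    }
```

Passing the base names from one directory of the extracted archive to `pair_files` yields one entry per dataset and split, which keeps the orig and norm files aligned when loading them together.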