Files in this item
Total size of all files in item: 213.86 KB
This item is publicly available and licensed under:
Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

- Name
- readme_web_crawlers.txt
- Size
- 1.3 KB
- Format
- Text file
- Description
- README file
- MD5
- 4c42c0b4f2097f31cecde483d937ca29
Author: Jože Bučar, Faculty of Information Studies in Novo mesto (contact: joze.bucar@gmail.com)
Abstract: Five web crawlers written in the R language for retrieving Slovenian news texts from the portals 24ur, Dnevnik, Finance, Rtvslo, and Žurnal24. These portals contain political, business, economic and financial content.
Keywords: Web-crawling, Slovene
Web resources:
- Slovenian news texts with political, business, economic and financial content published between 1 September 2007 and 31 January 2016 from five Slovenian web media: www.24ur.com, www.dnevnik.si, www.finance.si, www.rtvslo.si, www.zurnal24.si
Type and size:
- .R (web-crawlers); size: 213 KB
Encoding: ANSI
Year: Last update 2016-02-14
Attributes (retrieved news):
URL main - Uniform Resource Locator (URL) of the resource (web medium) [string; www.24ur.com, www.dnevnik.si, www.finance.si, www.rtvslo.si, www.zurnal24.si]
URL - URL of . . .

- Name
- web_crawler_24UR.r
- Size
- 61.36 KB
- Format
- Unknown
- Description
- Web crawler for 24ur
- MD5
- 683b72b1265c9bc64464cbe3e8df278b

- Name
- web_crawler_Dnevnik.r
- Size
- 16.5 KB
- Format
- Unknown
- Description
- Web crawler for Dnevnik
- MD5
- 704bbf20657684968a4260cc0d76fb5e

- Name
- web_crawler_Finance.r
- Size
- 50.96 KB
- Format
- Unknown
- Description
- Web crawler for Finance
- MD5
- 436b1103bb5ddabf4aadd5aaee652584

- Name
- web_crawler_RTVSLO.r
- Size
- 41.08 KB
- Format
- Unknown
- Description
- Web crawler for Rtvslo
- MD5
- 201e4b0ad829c52a06f2c0a8cf027643

- Name
- web_crawler_Zurnal24.r
- Size
- 42.66 KB
- Format
- Unknown
- Description
- Web crawler for Zurnal24
- MD5
- 694b2ba88f6aa905563f58218e7b3b38
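
The MD5 values listed above can be used to check that downloaded files arrived intact. The following is a minimal sketch in Python, not part of the item itself; the function names `md5_of_file` and `verify_downloads` are hypothetical, and it assumes the files were saved under the names shown in the listing, in a single directory.

```python
import hashlib
from pathlib import Path

# File names and MD5 digests copied from the listing on this page.
EXPECTED_MD5 = {
    "readme_web_crawlers.txt": "4c42c0b4f2097f31cecde483d937ca29",
    "web_crawler_24UR.r": "683b72b1265c9bc64464cbe3e8df278b",
    "web_crawler_Dnevnik.r": "704bbf20657684968a4260cc0d76fb5e",
    "web_crawler_Finance.r": "436b1103bb5ddabf4aadd5aaee652584",
    "web_crawler_RTVSLO.r": "201e4b0ad829c52a06f2c0a8cf027643",
    "web_crawler_Zurnal24.r": "694b2ba88f6aa905563f58218e7b3b38",
}


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Map each expected file name to 'ok', 'mismatch', or 'missing'."""
    results = {}
    for name, expected_digest in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of_file(path) == expected_digest:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    # Assumes the files were downloaded into the current directory.
    for name, status in verify_downloads(".").items():
        print(f"{status:8}  {name}")
```

A "mismatch" most likely indicates an incomplete download or a file that was re-encoded or had its line endings converted after download. MD5 is adequate here as an integrity check against accidental corruption, not as protection against deliberate tampering.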