dc.contributor.author Bučar, Jože
dc.date.accessioned 2017-04-23T17:46:05Z
dc.date.available 2017-04-23T17:46:05Z
dc.date.issued 2017-04-23
dc.identifier.uri http://hdl.handle.net/11356/1105
dc.description Five web-crawlers written in the R language for retrieving Slovenian texts from the news portals 24ur, Dnevnik, Finance, Rtvslo, and Žurnal24. These portals contain political, business, economic and financial content.
dc.language.iso slv
dc.publisher Faculty of Information Studies Novo mesto
dc.relation.isreferencedby https://doi.org/10.1007/s10579-018-9413-3
dc.rights Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dc.rights.uri https://creativecommons.org/licenses/by-sa/4.0/
dc.rights.label PUB
dc.source.uri https://github.com/19Joey85/Sentiment-annotated-news-corpus-and-sentiment-lexicon-in-Slovene/
dc.subject web crawling
dc.subject R
dc.title R crawlers for five Slovenian web media 1.0
dc.type toolService
metashare.ResourceInfo#ContentInfo.detailedType tool
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent false
has.files yes
branding CLARIN.SI data & tools
contact.person Jože Bučar joze.bucar@gmail.com Laboratory of Data Technologies, Faculty of Information Studies in Novo mesto, Slovenia
sponsor ARRS (Slovenian Research Agency) MR-35498 Young Researcher Programme nationalFunds
sponsor Human Resources Development and Scholarship Fund, Ministry of Education, Science and Sport, Slovenia 11012-55/2015 Javni razpis financiranja raziskovalnega sodelovanja doktorskih študentov v tujini v letu 2014 (186. JR) nationalFunds
sponsor The European Regional Development Fund Operational Programme for Strengthening Regional Development Potentials for the period 2007-2013 Other
files.count 6
files.size 218992


 Files in this item

Name: readme_web_crawlers.txt
Size: 1.3 KB
Format: Text file
Description: README file
MD5: 4c42c0b4f2097f31cecde483d937ca29
File preview:
Author: Jože Bučar, Faculty of Information Studies in Novo mesto (contact: joze.bucar@gmail.com)

Abstract:
Five web-crawlers written in the R language for retrieving Slovenian news texts from the portals 24ur, Dnevnik, Finance, Rtvslo, and Žurnal24. These portals contain political, business, economic and financial content.

Keywords:
Web-crawling, Slovene

Web resources:
- Slovenian news texts with political, business, economic and financial content, published between 1 September 2007 and 31 January 2016 by five Slovenian web media: www.24ur.com, www.dnevnik.si, www.finance.si, www.rtvslo.si, www.zurnal24.si

Type and size:
- .R (web-crawlers); size: 213 KB

Encoding: ANSI

Year: Last update 2016-02-14

Attributes (retrieved news):
URL main - Uniform Resource Locator (URL) of the resource (web medium) [string; www.24ur.com, www.dnevnik.si, www.finance.si, www.rtvslo.si, www.zurnal24.si]
URL - URL of . . .
                                            
Name: web_crawler_24UR.r
Size: 61.36 KB
Format: Unknown
Description: Web crawler for 24ur
MD5: 683b72b1265c9bc64464cbe3e8df278b

Name: web_crawler_Dnevnik.r
Size: 16.5 KB
Format: Unknown
Description: Web crawler for Dnevnik
MD5: 704bbf20657684968a4260cc0d76fb5e

Name: web_crawler_Finance.r
Size: 50.96 KB
Format: Unknown
Description: Web crawler for Finance
MD5: 436b1103bb5ddabf4aadd5aaee652584

Name: web_crawler_RTVSLO.r
Size: 41.08 KB
Format: Unknown
Description: Web crawler for Rtvslo
MD5: 201e4b0ad829c52a06f2c0a8cf027643

Name: web_crawler_Zurnal24.r
Size: 42.66 KB
Format: Unknown
Description: Web crawler for Zurnal24
MD5: 694b2ba88f6aa905563f58218e7b3b38