Commit a9846be

implemented gzfile to work directly with gzipped files
1 parent 351c09f commit a9846be

1 file changed: +2 -2 lines changed

data-analysis.R

Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
 # First analysis script
 
-# file extracted from a gzipped file at URL: https://dumps.wikimedia.org/other/pagecounts-all-sites/2015/2015-11/pagecounts-20151101-060000.gz
-pagecounts.20151101.060000 <- read.csv2("pagecounts-20151101-060000", header=FALSE, row.names=NULL, sep="", stringsAsFactors=FALSE)
+# data extracted directly from downloaded gzipped file at URL: https://dumps.wikimedia.org/other/pagecounts-all-sites/2015/2015-11/pagecounts-20151101-060000.gz
+pagecounts.20151101.060000 <- read.csv2(gzfile("pagecounts-20151101-060000.gz"), header=FALSE, row.names=NULL, sep="", stringsAsFactors=FALSE)
 # load csv file derived from the source code of URL: https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Medicine/Lists_of_pages/Articles
 wikiproject_medicine <- read.csv2("wikiproject_medicine.csv", header=FALSE, stringsAsFactors=FALSE)
 # join both dataframes (rough solution to cope with default names - TO DO: rename fields)
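
The change relies on `gzfile()` returning a connection that `read.csv2()` can read from, so the archive never has to be unpacked to disk. The snippet below is a minimal sketch of that pattern, not part of the commit: the temporary file, its two sample rows, and the `demo` variable are hypothetical stand-ins for the real pagecounts dump.

```r
# Hypothetical stand-in for pagecounts-20151101-060000.gz:
# a small gzip-compressed, whitespace-separated file in a temp location.
tmp <- tempfile(fileext = ".gz")

# Write two made-up rows through a gzip connection.
con <- gzfile(tmp, open = "wt")
writeLines(c("en Main_Page 42 1000", "en Aspirin 7 2500"), con)
close(con)

# Read it back with the same arguments the commit uses.
# gzfile() is passed unopened; read.csv2() opens it, reads, and closes it.
demo <- read.csv2(gzfile(tmp), header = FALSE, row.names = NULL,
                  sep = "", stringsAsFactors = FALSE)

str(demo)    # columns get the default names V1..V4
unlink(tmp)  # remove the temporary file
```

With `header=FALSE` the columns receive the default names `V1` to `V4`, which is the naming issue the script's later "TO DO: rename fields" comment refers to.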
