performance - How do I efficiently find the number of tweets & retweets in a time span using R? (twitteR package)


I want to find out the number of tweets, favourites, and retweets (cumulative is enough) of UK general election candidates from several parties (>2000 candidates) in the 2 months before the election. So far I have tried to make a loop using twitteR's userTimeline and (inside the loop, because I don't know how to save it otherwise) saving the number of tweets, retweets, and favourites.

`current` is a list of Twitter usernames. I'm a programming newbie, please don't hate:

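    library(twitteR)

    tweetsy.2017 <- function(x) {
      # Download up to 3200 tweets for user x, including retweets and replies
      one <- userTimeline(x, n = 3200, includeRts = TRUE, excludeReplies = FALSE)
      onedf <- twListToDF(one)
      # Keep only tweets created in the period of interest
      oneperiod <- subset(onedf, created >= as.POSIXct('2017-04-18 00:00:00') &
                                 created <= as.POSIXct('2017-06-08 23:59:00')) #61 days
      # Original tweets only (computed, but not used in the counts below)
      oneperiod2 <- oneperiod[oneperiod$isRetweet == FALSE, ]
      ro <- nrow(oneperiod)               # number of tweets, retweets included
      f  <- sum(oneperiod$favoriteCount)  # total favourites
      re <- sum(oneperiod$retweetCount)   # total retweets
      output <- list(ro, f, re)
      return(output)
      #Sys.sleep(100)
    }

    tweets.2017 <- lapply(current, tweetsy.2017)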
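One way to make the counting step easier to check is to separate it from the download. The following is only a sketch: `summarise_period` and its arguments are hypothetical names, it assumes a data frame with the columns that `twListToDF()` produces (`created`, `isRetweet`, `favoriteCount`, `retweetCount`), and the UTC time zone is an assumption. It runs on synthetic data, so no API access is needed.

```r
# Hypothetical helper: summarise an already-downloaded timeline data frame.
# Assumes columns created, isRetweet, favoriteCount, retweetCount.
summarise_period <- function(df, from, to, tz = "UTC") {
  start <- as.POSIXct(from, tz = tz)  # explicit time zone (assumed UTC)
  end   <- as.POSIXct(to, tz = tz)
  in_period <- df[df$created >= start & df$created <= end, , drop = FALSE]
  data.frame(
    tweets     = sum(!in_period$isRetweet),      # original tweets
    retweets   = sum(in_period$isRetweet),       # retweets made by the user
    favourites = sum(in_period$favoriteCount),   # favourites over all rows
    retweeted  = sum(in_period$retweetCount)     # retweet counts over all rows
  )
}

# Synthetic example data (not real tweets)
demo <- data.frame(
  created = as.POSIXct(c("2017-04-01 12:00:00",
                         "2017-05-01 12:00:00",
                         "2017-06-01 12:00:00"), tz = "UTC"),
  isRetweet     = c(FALSE, FALSE, TRUE),
  favoriteCount = c(5, 3, 0),
  retweetCount  = c(1, 2, 10)
)

res <- summarise_period(demo, "2017-04-18 00:00:00", "2017-06-08 23:59:00")
print(res)
```

Returning a one-row data frame with named columns, rather than an unnamed list, makes it easier to stack the results for many candidates later.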

My problem is that it takes very long and gives no intermediate data. Also, it seems inefficient to download all the tweets just to get the number of them. Oh, and I put the sleep there in case I reach the API limit, but it seems the code is too slow to reach it anyway.

Does anyone have a better idea? I have tried mclapply and parLapply but haven't managed to get them running.
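A possible way to get intermediate data is to write each candidate's counts to a file as soon as they are available, so an interrupted run can be resumed. This is a sketch under assumptions, not a tested solution for the Twitter API: `collect_with_checkpoint`, `fetch_fun`, `path` and `pause` are hypothetical names, and `fetch_fun` is assumed to return a list with elements `tweets`, `favourites` and `retweets`. A fake fetch function stands in for the real download here.

```r
# Hypothetical helper: call fetch_fun for each id and append the result to a
# CSV file immediately, so partial results survive an interruption.
collect_with_checkpoint <- function(ids, fetch_fun, path, pause = 0) {
  # Ids already written by an earlier run are skipped
  done <- character(0)
  if (file.exists(path)) {
    done <- as.character(read.csv(path, stringsAsFactors = FALSE)$id)
  }
  for (id in setdiff(ids, done)) {
    # A failed download is recorded as NA instead of stopping the whole run
    counts <- tryCatch(fetch_fun(id), error = function(e) NULL)
    if (is.null(counts)) {
      counts <- list(tweets = NA, favourites = NA, retweets = NA)
    }
    row <- data.frame(id = id,
                      tweets = counts$tweets,
                      favourites = counts$favourites,
                      retweets = counts$retweets,
                      stringsAsFactors = FALSE)
    first <- !file.exists(path)  # write the header only once
    write.table(row, path, sep = ",", row.names = FALSE,
                col.names = first, append = !first)
    Sys.sleep(pause)             # optional pause between requests
  }
  read.csv(path, stringsAsFactors = FALSE)
}

# Fake fetch function for illustration only (no API call)
fake_fetch <- function(id) {
  if (id == "bad_user") stop("simulated API error")
  list(tweets = nchar(id), favourites = 2, retweets = 1)
}

out_file <- tempfile(fileext = ".csv")
partial <- collect_with_checkpoint(c("alice", "bad_user"), fake_fetch, out_file)
# A second call only processes ids that are not yet in the file
full <- collect_with_checkpoint(c("alice", "bad_user", "bob"), fake_fetch, out_file)
print(full)
```

Note that in this sketch a failed id is also written to the file (with NA counts), so it is not retried on the next run unless its row is removed.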

I wrapped it in a for loop, so I can have intermediate results. It works fine now!

    ab <- list()  # collects one result per candidate
    for (i in seq_len(nrow(twitter_data))) {
      print(paste("row number ", i, " of ", nrow(twitter_data)))
      id <- twitter_data[i, 1]
      print(as.vector(id))
      ab[[i]] <- tweetsy.2017(id)
      print("Process sleeps for a few seconds due to Twitter API security issues, then continues")
      Sys.sleep(9)
    }
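Once the loop has finished, the list of results can be turned into a single table. The sketch below assumes each element of `ab` has the shape returned by `tweetsy.2017` (an unnamed list of three numbers: tweets, favourites, retweets); `ab_demo` and `ids_demo` are made-up stand-ins so the example runs without the API.

```r
# Synthetic stand-in for `ab`: each element mimics list(ro, f, re)
ab_demo  <- list(list(10, 25, 7), list(3, 0, 1))
ids_demo <- c("candidate_a", "candidate_b")

# Turn each result into a one-row data frame, then stack the rows
counts <- do.call(rbind, lapply(ab_demo, function(x) {
  data.frame(tweets = x[[1]], favourites = x[[2]], retweets = x[[3]])
}))

# Attach the usernames so each row can be identified
counts <- data.frame(id = ids_demo, counts, stringsAsFactors = FALSE)
print(counts)
```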
