R - How to create all n-1 long subsets of a vector and save both the remaining vector and the removed element efficiently?


I am toying around with building a recommender system. I have the historical purchases of users.

My data looks like this:

    > head(baskets)
    # A tibble: 6 x 2
    # Groups:   user_id [2]
      user_id     basket
        <int>     <list>
    1       8 <int [21]>
    2       8 <int [13]>
    3       8 <int [15]>
    4      12 <int [22]>
    5      12 <int [20]>
    6      12 <int [17]>

    > baskets$basket[[1]]
     [1]   651  1529  2078  6141  6473  9839 14992 16349 17794 20920 21903
    [12] 23165 23400 24838 28985 32030 34190 39110 39812 44099 49533

What I want is to remove one item from each basket, save it as the target item, and save the rest of the basket as a new basket. This should be repeated for all items in the basket. For example, if I had a user with user_id = 1 and basket = [1,2,3], I would get

    user_id   basket   target
          1      2,3        1
          1      1,3        2
          1      1,2        3

How can I construct such a data.frame / tibble in an efficient way? I have a solution that seems to work, but it is quite slow, and since I have a large amount of data I would like to find a better solution if possible.

Currently I have

    orderdf <- data.frame(user_id = integer(), basket = list(), target = integer())

    for (k in 1:dim(baskets)[1]) {
      print(k)
      currbasket <- baskets$basket[[k]]
      # one entry per item: the basket with item i removed (negative index)
      currbaskets <- lapply(1:length(currbasket), function(i) currbasket[-i])
      curruser <- baskets$user_id[k]
      for (j in 1:length(currbaskets)) {
        tempdf <- tibble(user_id = baskets$user_id[k],
                         basket = list(currbaskets[[j]]),
                         target = currbasket[j])
        # growing the result row by row is what makes this slow
        orderdf <- rbind(orderdf, tempdf)
      }
    }

First, I create myself a reproducible dataset:

    baskets <- data.frame(user_id = 1:10)
    # give every user a basket of 3 distinct items
    for (i in 1:nrow(baskets)) {
      baskets$basket[i] = list(sample(1:100, 3, replace = FALSE))
    }
    head(baskets)

Next time, please provide a reproducible dataset!

The next thing is to build a function that handles one line:

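    require(data.table)

    # one row in the shape that apply() passes to the function:
    # a list with a user_id and the vector of items
    x <- list(user_id = baskets$user_id[1], basket = baskets$basket[[1]])

    foraline <- function(x){
      n_inbasket <- length(unlist(x$basket))
      result <- data.table(user_id = rep(x$user_id, n_inbasket))
      # basket with the i-th item removed, once per item
      result$basket <- sapply(1:n_inbasket, function(i){list(unlist(x$basket)[-i])})
      # the removed items, in the same order
      result$target <- unlist(x$basket)
      return(result)
    }

    foraline(x)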
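The part that does the real work in the function above is R's negative indexing: `v[-i]` returns the vector without its i-th element. As a minimal base-R sketch on the toy basket from the question (the helper name `leave_one_out` is made up here for illustration and is not part of the answer):

```r
# Hypothetical helper: returns every "basket minus one item" together
# with the item that was removed.
leave_one_out <- function(items) {
  list(
    # seq_along() also copes with an empty basket, unlike 1:length(items)
    basket = lapply(seq_along(items), function(i) items[-i]),
    target = items  # the removed items, in the same order
  )
}

res <- leave_one_out(c(1, 2, 3))
res$basket[[1]]  # 2 3
res$target[1]    # 1
```

Each entry of `res$basket` lines up with the entry of `res$target` at the same position, which is the pairing the question asks for.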

OK, and now apply it on all lines and reduce the results into one data.frame using rbindlist from the data.table package.

    require(data.table)
    order_basket <- rbindlist(apply(baskets, 1, foraline))
    head(order_basket)
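If calling a function once per row is still too slow, the same table could probably be assembled in one pass by repeating each user_id once per item and flattening the nested list of reduced baskets. The following is only a sketch under assumptions: it uses base R alone, it assumes `baskets` has a `user_id` column and a list column `basket`, the name `expand_baskets` is hypothetical, and it has not been benchmarked against the version above.

```r
# Hypothetical alternative, assuming `baskets` has a `user_id` column
# and a list column `basket`.
expand_baskets <- function(baskets) {
  n <- lengths(baskets$basket)                  # number of items per basket
  out <- data.frame(
    user_id = rep(baskets$user_id, times = n),  # one row per removed item
    target  = unlist(baskets$basket)            # the removed item
  )
  # For each basket build all "basket minus item i" vectors, then flatten
  # one level so the list lines up with the rows of `out`.
  out$basket <- unlist(
    lapply(baskets$basket, function(b) lapply(seq_along(b), function(i) b[-i])),
    recursive = FALSE
  )
  out[, c("user_id", "basket", "target")]
}

# Small check on two toy baskets
toy <- data.frame(user_id = c(1L, 2L))
toy$basket <- list(c(1L, 2L, 3L), c(7L, 9L))
res <- expand_baskets(toy)
res
```

The result has one row per item in the input, with `basket` stored as a list column, so it should be usable in the same way as `order_basket`.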
