R - Subset dataframe based on non-sequential dates
I have data that looks like this:
df <- data.frame(
  datecol = as.Date(c("2010-04-03","2010-04-04","2010-04-05","2010-04-06","2010-04-07",
                      "2010-04-03","2010-04-04","2010-04-05","2010-04-06","2010-04-07",
                      "2010-05-06","2010-05-07","2010-05-09","2010-06-06","2010-06-07")),
  x = c(1,1,1,0,1,1,1,0,0,0,1,0,0,0,1),
  type = c(rep("a",5), rep("b",5), rep("c",5)))

> df
      datecol x type
1  2010-04-03 1    a
2  2010-04-04 1    a
3  2010-04-05 1    a
4  2010-04-06 0    a
5  2010-04-07 1    a
6  2010-04-03 1    b
7  2010-04-04 1    b
8  2010-04-05 0    b
9  2010-04-06 0    b
10 2010-04-07 0    b
11 2010-05-06 1    c
12 2010-05-07 0    c
13 2010-05-09 0    c
14 2010-06-06 0    c
15 2010-06-07 1    c
I need to subset this dataframe by type, keeping only the "types" that have 2 or more different dates where those dates are at least 1 day apart. In the above example, type a has 4 different dates, and type c has 2 different dates that are more than 1 day apart, so I want to save these 2 to a new dataframe. Type b has 2 different dates, but they are not more than 1 day apart, so I don't want to keep it.
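I was thinking of a loop that counts how many unique dates there are within each type and keeps everything that has more than 2 different dates. Then I would look at the ones that have only 2 different dates, calculate the distance between them, and keep only the ones where the distance is more than 1. It seems like there should be a more efficient way. Any ideas?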
One solution with data.table:
#make sure datecol is a Date
df$datecol <- as.Date(df$datecol)
library(data.table)
#x needs to be 1 and the date difference more than a day per type
#then in the second [] select the TRUEs
setDT(df)[x == 1, diff(datecol) > 1, by = type][V1 == TRUE, type]
#[1] a c
#Levels: a b c
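The call above returns only the names of the qualifying types, while the question asks for a new dataframe. Here is a minimal sketch of that last step, assuming the same example data. The names `dt`, `has_gap`, `keep_types` and `df_new` are illustrative and not part of the original answer. It also sorts and de-duplicates the dates before taking differences, an extra precaution for data that is not already ordered within each type.

```r
library(data.table)

# Example data from the question
df <- data.frame(
  datecol = as.Date(c("2010-04-03","2010-04-04","2010-04-05","2010-04-06","2010-04-07",
                      "2010-04-03","2010-04-04","2010-04-05","2010-04-06","2010-04-07",
                      "2010-05-06","2010-05-07","2010-05-09","2010-06-06","2010-06-07")),
  x = c(1,1,1,0,1,1,1,0,0,0,1,0,0,0,1),
  type = c(rep("a",5), rep("b",5), rep("c",5))
)

# Work on a data.table copy so df itself is left untouched
dt <- as.data.table(df)

# For each type, look only at rows with x == 1 and flag the type if any two
# consecutive distinct dates are more than one day apart.
# as.numeric() on a Date gives days since 1970-01-01, so diff() is in days.
keep_types <- dt[x == 1,
                 .(has_gap = any(diff(as.numeric(sort(unique(datecol)))) > 1)),
                 by = type][has_gap == TRUE, type]

# Keep every row (x == 0 rows included) of the qualifying types
df_new <- dt[type %in% keep_types]
```

If only the `x == 1` rows are wanted in the result, adding that condition to the last filter (`dt[type %in% keep_types & x == 1]`) should be enough.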