r - Create a new column with information from another column if two elements of a row are equal
I'm new to R and did not find a solution for my specific problem. I hope you guys can help me.
I have the following data frame:

    hid <- c('1','2','2','2','2','4','4','4','4','4','4')
    syear <- c(2000,2001,2003,2003,2003,2000,2000,2001,2001,2002,2002)
    employlvl <- c('full-time','part-time','part-time','unemployed','unemployed',
                   'full-time','full-time','full-time','unemployed','part-time',
                   'full-time')
    relhead <- c('head','head','head','partner','child','head','partner','head',
                 'partner','head','partner')
    df <- data.frame(hid,syear,employlvl,relhead)

| hid | syear | employment | relation head of hh |
|-----|-------|------------|---------------------|
| 1   | 2000  | full-time  | head                |
| 2   | 2001  | part-time  | head                |
| 2   | 2003  | part-time  | head                |
| 2   | 2003  | unemployed | partner             |
| 2   | 2003  | unemployed | child               |
| 4   | 2000  | full-time  | head                |
| 4   | 2000  | full-time  | partner             |
| 4   | 2001  | full-time  | head                |
| 4   | 2001  | unemployed | partner             |
| 4   | 2002  | part-time  | head                |
| 4   | 2002  | full-time  | partner             |

I would like to create a new column holding the employment level of the partner, filled in on the head's row, whenever the values in hid (household identification number) and syear (survey year) are equal.
I hope to get the following output:

| hid | syear | employment | relation head of hh | employment partner |
|-----|-------|------------|---------------------|--------------------|
| 1   | 2000  | full-time  | head                | NA                 |
| 2   | 2001  | part-time  | head                | NA                 |
| 2   | 2003  | part-time  | head                | unemployed         |
| 2   | 2003  | unemployed | partner             | NA                 |
| 2   | 2003  | unemployed | child               | NA                 |
| 4   | 2000  | full-time  | head                | full-time          |
| 4   | 2000  | full-time  | partner             | NA                 |
| 4   | 2001  | full-time  | head                | unemployed         |
| 4   | 2001  | unemployed | partner             | NA                 |
| 4   | 2002  | part-time  | head                | full-time          |
| 4   | 2002  | full-time  | partner             | NA                 |

Thanks in advance!
We can achieve this using `dplyr` and `tidyr`. There are 2 steps.

Step 1: Find out which `hid`, `syear` combinations have more than one record. Keep those, and filter out the records for `child`. Use `spread` to put the `head` and `partner` employment levels side by side, creating a new data frame. Create a new column `relation` with the value `head` for merging. `dt2` is the output of this step.

Step 2: Use `left_join` to combine `dt2` with the original data frame `dt`. `dt3` is the final output.
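
    library(dplyr)
    library(tidyr)

    # Step 1: one row per (hid, syear) with the partner's employment level
    dt2 <- dt %>%
      group_by(hid, syear) %>%
      filter(n() > 1) %>%                              # households with more than one record
      filter(`relation head of hh` != "child") %>%     # drop children
      spread(`relation head of hh`, employment) %>%    # columns: head, partner
      mutate(relation = "head") %>%                    # key used to join onto the head's row
      rename(`employment partner` = partner) %>%
      select(-head)

    # Step 2: attach the partner's employment level to the head's row
    dt3 <- dt %>%
      left_join(dt2, by = c("hid", "syear", "relation head of hh" = "relation"))
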
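In recent tidyr releases `spread()` is superseded by `pivot_wider()`. The following is a minimal sketch of the same two steps written with `pivot_wider()`, not a replacement for the answer above. It assumes the same column names as the `dt` used in this answer; the data is repeated (built with `tibble()`) only so that the block runs on its own.

```r
library(dplyr)
library(tidyr)

# Same example data as in the answer, repeated so this block is self-contained
dt <- tibble(
  hid = c(1, 2, 2, 2, 2, 4, 4, 4, 4, 4, 4),
  syear = c(2000, 2001, 2003, 2003, 2003, 2000, 2000, 2001, 2001, 2002, 2002),
  employment = c("full-time", "part-time", "part-time", "unemployed",
                 "unemployed", "full-time", "full-time", "full-time",
                 "unemployed", "part-time", "full-time"),
  `relation head of hh` = c("head", "head", "head", "partner", "child",
                            "head", "partner", "head", "partner", "head",
                            "partner")
)

# Step 1: one row per (hid, syear) holding the partner's employment level
dt2 <- dt %>%
  group_by(hid, syear) %>%
  filter(n() > 1) %>%                            # more than one record
  ungroup() %>%
  filter(`relation head of hh` != "child") %>%   # drop children
  pivot_wider(names_from = `relation head of hh`,
              values_from = employment) %>%      # columns: head, partner
  mutate(relation = "head") %>%                  # join key for the head's row
  rename(`employment partner` = partner) %>%
  select(-head)

# Step 2: join the partner's employment level onto the head's row
dt3 <- dt %>%
  left_join(dt2, by = c("hid", "syear", "relation head of hh" = "relation"))

print(dt3)
```

Rows for partners, children, and heads without a partner in the same `hid`/`syear` combination end up with `NA` in `employment partner`.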
Data:

    library(dplyr)

    # tibble() replaces the deprecated data_frame()
    dt <- tibble(hid = c(1, 2, 2, 2, 2, 4, 4, 4, 4, 4, 4),
                 syear = c(2000, 2001, 2003, 2003, 2003, 2000, 2000, 2001, 2001,
                           2002, 2002),
                 employment = c("full-time", "part-time", "part-time",
                                "unemployed", "unemployed", "full-time",
                                "full-time", "full-time", "unemployed",
                                "part-time", "full-time"),
                 "relation head of hh" = c("head", "head", "head", "partner",
                                           "child", "head", "partner", "head",
                                           "partner", "head", "partner"))

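As a cross-check, here is a sketch of a different technique (a grouped `mutate`, not the reshape-and-join approach described above). It assumes at most one partner per `hid`/`syear` combination; the name `dt_alt` is only illustrative, and the data is repeated so the block runs on its own.

```r
library(dplyr)

# Same example data as above, repeated so this block is self-contained
dt <- tibble(
  hid = c(1, 2, 2, 2, 2, 4, 4, 4, 4, 4, 4),
  syear = c(2000, 2001, 2003, 2003, 2003, 2000, 2000, 2001, 2001, 2002, 2002),
  employment = c("full-time", "part-time", "part-time", "unemployed",
                 "unemployed", "full-time", "full-time", "full-time",
                 "unemployed", "part-time", "full-time"),
  `relation head of hh` = c("head", "head", "head", "partner", "child",
                            "head", "partner", "head", "partner", "head",
                            "partner")
)

dt_alt <- dt %>%
  group_by(hid, syear) %>%
  mutate(
    # Within each household-year, look up the partner's employment level.
    # Subsetting with [1] yields NA when the group has no partner.
    `employment partner` = if_else(
      `relation head of hh` == "head",
      employment[`relation head of hh` == "partner"][1],
      NA_character_
    )
  ) %>%
  ungroup()

print(dt_alt)
```

This avoids the intermediate data frame, at the cost of silently taking the first partner if a household-year ever contains more than one.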