r - dplyr: split by semicolon in case_when()
Suppose I have a data frame df:

    library(dplyr)

    df <- data.frame(id = c(1:10),
                     type = c('a', 'a;b', 'b', 'a', 'b', 'b', 'c', 'a;c', 'b;c', 'c'))

and I want to add a column called color, based on the values that appear in type. (This is only an example; in my real code there are many more variations of type, e.g. d;f, e;q, a;z, etc.)

    df %>%
      mutate(color = case_when(
        type == 'a' ~ 'red',
        type == 'b' ~ 'blue',
        type == 'c' ~ 'green',
        TRUE ~ as.character(type)
      ))

As it stands, this returns:

       id type color
    1   1    a   red
    2   2  a;b   a;b
    3   3    b  blue
    4   4    a   red
    5   5    b  blue
    6   6    b  blue
    7   7    c green
    8   8  a;c   a;c
    9   9  b;c   b;c
    10 10    c green

I am curious whether there is a way to split on the semicolon within case_when(), in order to produce this output:

       id type      color
    1   1    a        red
    2   2  a;b   red;blue
    3   3    b       blue
    4   4    a        red
    5   5    b       blue
    6   6    b       blue
    7   7    c      green
    8   8  a;c  red;green
    9   9  b;c blue;green
    10 10    c      green
You can split the type column into separate rows, map the colors, and then paste them back together:
    library(dplyr)
    library(tidyr)

    df %>%
      # one row per individual type; the default separator splits on ";"
      separate_rows(type) %>%
      mutate(color = case_when(
        type == 'a' ~ 'red',
        type == 'b' ~ 'blue',
        type == 'c' ~ 'green',
        TRUE ~ as.character(type)
      )) %>%
      group_by(id) %>%
      # collapse the pieces back into one row per id
      # (a formula lambda replaces funs(), which is deprecated in recent dplyr)
      summarise_all(~ paste0(., collapse = ";"))

    # # A tibble: 10 x 3
    #       id type  color
    #    <int> <chr> <chr>
    #  1     1 a     red
    #  2     2 a;b   red;blue
    #  3     3 b     blue
    #  4     4 a     red
    #  5     5 b     blue
    #  6     6 b     blue
    #  7     7 c     green
    #  8     8 a;c   red;green
    #  9     9 b;c   blue;green
    # 10    10 c     green

Besides case_when(), you can also put the type-to-color mapping in a named character vector and look the colors up later:
    map <- c(a = 'red', b = 'blue', c = 'green')

    df %>%
      separate_rows(type) %>%
      # index the named vector by type to fetch each color
      mutate(color = map[type]) %>%
      group_by(id) %>%
      summarise_all(~ paste0(., collapse = ";"))
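
One caveat with the lookup-vector approach: indexing a named vector with a name it does not contain gives NA, so a type such as d would not fall through unchanged the way it does with the TRUE branch of case_when(). Below is a minimal base R sketch that keeps unknown types as they are and does the split and re-join without reshaping the data. The names recode_split and color_map are hypothetical and chosen only for illustration; they are not part of dplyr or tidyr.

```r
# Hypothetical helper (not from any package): translate each ";"-separated
# code through a named lookup vector, leaving unknown codes unchanged.
recode_split <- function(x, lookup, sep = ";") {
  # split every element into its individual codes
  parts <- strsplit(as.character(x), sep, fixed = TRUE)
  vapply(parts, function(p) {
    hit <- unname(lookup[p])            # NA where a code is not in the lookup
    hit[is.na(hit)] <- p[is.na(hit)]    # fall back to the original code
    paste(hit, collapse = sep)          # glue the translated codes together
  }, character(1))
}

color_map <- c(a = "red", b = "blue", c = "green")

df <- data.frame(id = 1:10,
                 type = c("a", "a;b", "b", "a", "b", "b", "c", "a;c", "b;c", "c"),
                 stringsAsFactors = FALSE)

df$color <- recode_split(df$type, color_map)
```

Because the helper is vectorised over its first argument, it should also work inside a pipeline, for example df %>% mutate(color = recode_split(type, color_map)).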