regex - How do I filter "CUSTOM" out of a string but not "CUSTOMER" in R? Grepl?


I'm trying to filter custom items out of a data frame in R using the item descriptions. I want to get rid of items with "custom" in the description, but I need to keep items with "customer" in the description. I tried using the grepl function to no avail. I've got 800,000+ rows of data, so something speedy would be helpful. This is one filter out of many, and I'm using dplyr and pipe operators for the other filters.

Generic code:

> items <- c("a", "b", "c")
> desc <- c("custom stamp", "customer 4x6 in stamp", "4x6 generic stamp")
> df <- data.frame(items = items, item_desc = desc)
> df
  items             item_desc
1     a          custom stamp
2     b customer 4x6 in stamp
3     c     4x6 generic stamp

I've tried this:

library(dplyr)
df <- df %>%
  filter(!grepl("custom", item_desc, fixed = TRUE))

But obviously, the result is:

> df
  items         item_desc
1     c 4x6 generic stamp

Whereas the desired result would be:

> df
  items             item_desc
1     b customer 4x6 in stamp
2     c     4x6 generic stamp

Thanks!

You need to use a regular expression here that utilizes word boundaries: "\\bcustom\\b".

To make it work, you need to remove the fixed = TRUE argument, which makes the engine treat the pattern as a literal string rather than as a regular expression.
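A quick way to check that point is to run the same pattern with and without fixed = TRUE on the sample descriptions from the question. This is only a minimal sketch in base R; the variable names literal_hits and regex_hits are made up for illustration:

```r
desc <- c("custom stamp", "customer 4x6 in stamp", "4x6 generic stamp")

# With fixed = TRUE the pattern is a literal string, so "\\b" is searched for
# as a backslash followed by "b", which never occurs in the descriptions.
literal_hits <- grepl("\\bcustom\\b", desc, fixed = TRUE)

# Without fixed = TRUE, "\\b" is interpreted as a word boundary.
regex_hits <- grepl("\\bcustom\\b", desc)

print(literal_hits)  # FALSE FALSE FALSE
print(regex_hits)    # TRUE FALSE FALSE
```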

Use

df <- df %>%
  filter(!grepl("\\bcustom\\b", item_desc))

This pattern matches "custom" only when it stands as a whole word, so "customer" is not matched. The items that do not match remain in df because the result of grepl is inverted with the ! operator.
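To see the matching and the inversion step by step, here is a small base R sketch. Rows "d" and "e" are hypothetical additions, not part of the original data, included only to show how the word boundary behaves around punctuation and longer words:

```r
# Sample data from the question, plus two hypothetical rows ("d" and "e")
# added here only to illustrate edge cases.
df <- data.frame(
  items = c("a", "b", "c", "d", "e"),
  item_desc = c("custom stamp", "customer 4x6 in stamp", "4x6 generic stamp",
                "custom-made stamp", "customs form"),
  stringsAsFactors = FALSE
)

# TRUE where "custom" appears as a whole word.
is_custom <- grepl("\\bcustom\\b", df$item_desc)

# Negating with ! keeps the rows that do NOT match.
kept <- df[!is_custom, ]
print(kept)
```

Rows "b", "c" and "e" are kept. Note that "custom-made" is dropped, because a hyphen is not a word character and therefore creates a word boundary after "custom"; whether that is the desired behaviour depends on the data. The same !grepl(...) expression can be placed inside filter() as in the dplyr call above.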

