
I have a column of text values that look like this:

str1 = "ABCID 123456789 is what I'm looking for, could you help me to check this Item's status?"

I want to use the gsub function in R to extract "ABCID 123456789" from it. The number varies from row to row, but ABCID is constant. Does anyone know how to do this? Thanks very much!

3 Answers

2

We can use str_extract to select the fixed word followed by a space and one or more digits (\\d+):

library(stringr)
str_extract(df1$col1, "ABCID \\d+")

If there are multiple instances, use str_extract_all

str_extract_all(df1$col1, "ABCID \\d+")

NOTE: The OP states that the goal is to extract "ABCID 123456789" from the string.
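For readers who would rather not depend on stringr, the same extraction can be sketched in base R with regexpr/gregexpr and regmatches. This is a minimal sketch, not the answerer's method; the sample rows in df1 are hypothetical and only mirror the df1$col1 naming used above.

```r
# Hypothetical sample data; df1 and col1 mirror the names used in the answer.
df1 <- data.frame(
  col1 = c(
    "ABCID 123456789 is what I'm looking for, could you help me to check this Item's status?",
    "Please also check ABCID 42 and ABCID 77.",
    "No identifier in this row."
  ),
  stringsAsFactors = FALSE
)

pattern <- "ABCID \\d+"

# First match per element. regexpr() returns -1 where nothing matches,
# so pre-fill with NA and assign only the matched positions.
m <- regexpr(pattern, df1$col1)
first_match <- rep(NA_character_, nrow(df1))
first_match[m != -1] <- regmatches(df1$col1, m)

# All matches per element, returned as a list with one entry per row.
all_matches <- regmatches(df1$col1, gregexpr(pattern, df1$col1))

print(first_match)
print(all_matches)
```

Rows without a match yield NA in first_match and an empty character vector in all_matches, which is also how str_extract and str_extract_all report them.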


1

If the number has a constant length (9 digits), you could use a positive lookbehind:

sub("(?<=ABCID \\d{9}).*", "", str1, perl = TRUE)
# [1] "ABCID 123456789"
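The fixed-length assumption matters here, and a short sketch may make that concrete. The ids vector below is hypothetical test input, not data from the question; it shows what the same call does when the number is longer or shorter than 9 digits.

```r
# Hypothetical inputs with 9, 10, and 5 digits after "ABCID ".
ids <- c(
  "ABCID 123456789 is what I'm looking for",
  "ABCID 1234567890 has ten digits",
  "ABCID 12345 has five digits"
)

# Same call as in the answer: delete everything that follows
# "ABCID " plus exactly nine digits.
trimmed <- sub("(?<=ABCID \\d{9}).*", "", ids, perl = TRUE)

print(trimmed)
```

With ten digits the tenth digit is cut off along with the rest of the text, and with five digits the lookbehind never succeeds, so sub leaves that string unchanged. If the length can vary, a pattern built on \\d+ is the safer choice.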


1

Match the beginning of the string (^), the leading letters (ABCID), a space, the digits (\d+), and everything else (.*), then replace it all with the captured portion, i.e. the portion within parentheses. Note that we want to use sub, not gsub, here because there is only one substitution per string.

sub("^(ABCID \\d+).*", "\\1", str1)
## [1] "ABCID 123456789"
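Since the question is about a whole column, it may help to see the same pattern applied to a vector. This is a sketch under assumptions: the vector x is hypothetical, and replacing non-matching rows with NA is one possible convention rather than something the question asks for.

```r
# Hypothetical column values; the last one contains no identifier.
x <- c(
  "ABCID 123456789 is what I'm looking for",
  "ABCID 42 needs a status check",
  "no identifier here"
)

pattern <- "^(ABCID \\d+).*"

# sub() is vectorised, so one call handles every element.
extracted <- sub(pattern, "\\1", x)

# sub() returns non-matching elements unchanged, so mask them explicitly.
extracted[!grepl(pattern, x)] <- NA_character_

print(extracted)
```

Without the grepl step, the third element would come back as the full original string, which is easy to mistake for a successful extraction.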
