I want to extract a number with decimals from a string with one single expression if possible.

For example transform "2,123.02" to "2123.02" - my current solution is:

paste(unlist(str_extract_all("2,123.02","\\(?[0-9.]+\\)?",simplify=F)),collapse="")

But what I'm looking for is a pattern for str_extract_all that returns the pieces already joined together, without the extra unlist/paste step. Is this possible to achieve with a regular expression?

2 Answers

You can try replacing the comma by an empty string:

gsub(",", "", "2,123.02")
#[1] "2123.02"

NB: If you need to replace only commas in between numbers, you can use lookarounds:

gsub("(?<=[0-9]),(?=[0-9])", "", "this, this is my number 2,123.02", perl=TRUE)
#[1] "this, this is my number 2123.02"

I edited to use gsub instead of sub in case you have strings with more than one number containing a comma. If you only have one such comma, sub is "sufficient".
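To make the sub/gsub distinction concrete, here is a minimal sketch in base R. The input string is a made-up example, not taken from the question; the pattern is the lookaround from above.

```r
# Lookaround pattern: a comma with a digit on both sides.
pattern <- "(?<=[0-9]),(?=[0-9])"

# Hypothetical input with two comma-formatted numbers.
x <- "first: 2,123.02, second: 3,456"

# sub() replaces only the first match; gsub() replaces every match.
first_only <- sub(pattern, "", x, perl = TRUE)
all_commas <- gsub(pattern, "", x, perl = TRUE)

print(first_only)  # "first: 2123.02, second: 3,456"
print(all_commas)  # "first: 2123.02, second: 3456"
```

The comma after "2123.02" is left alone in both cases because it is followed by a space rather than a digit.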

NB2: You can call str_extract_all on the result from gsub, e.g.:

str_extract_all(gsub("(?<=[0-9]),(?=[0-9])", "", "first number: 2,123.02, second number: 3,456", perl=TRUE), "\\d+\\.*\\d*", simplify=FALSE)
#[[1]]
#[1] "2123.02" "3456"   
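
If stringr is not available, roughly the same two steps can be sketched in base R with gregexpr and regmatches. This is only an illustration: the extraction pattern below is a slightly stricter variant I chose (digits, optionally followed by one decimal part), not the pattern used above, and the final as.numeric conversion is an extra step.

```r
x <- "first number: 2,123.02, second number: 3,456"

# Step 1: drop commas that sit between two digits.
cleaned <- gsub("(?<=[0-9]),(?=[0-9])", "", x, perl = TRUE)

# Step 2: pull out every run of digits with an optional decimal part.
matches <- regmatches(cleaned, gregexpr("[0-9]+(\\.[0-9]+)?", cleaned))[[1]]

# Optional: convert the extracted strings to numbers.
numbers <- as.numeric(matches)

print(matches)  # "2123.02" "3456"
print(numbers)
```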

4 Comments

Thanks, this helps my understanding of regular expressions.
Yeah, your second line was more what I was looking for, as the strings are more complicated than just removing the "," like in my example. But that did help me, thanks for the extra explanation!
@CathG you don't need to escape commas in regular expressions, even if fixed=FALSE
@MatthewPlourde indeed :-) I see special characters everywhere (well I guess better escape a non-special character than not escaping a "true" one... ;-) ). Edited, thanks!
Another option is extract_numeric in the tidyr package (later versions of tidyr deprecate it in favour of readr::parse_number).

library(tidyr)
extract_numeric("2,123.02")

[1] 2123.02

5 Comments

I'm not sure how it would work with multiple numbers. The function strips anything that is not a digit, a point or a hyphen (minus sign): as.numeric(gsub("[^0-9.-]+", "", as.character(x)))
@CathG it recognizes minus if that's what you meant?
@Roman, I mean that if the string is, for example, "2,123.02 2,565", it will end up as something "weird", since all spaces, text, etc. are stripped
@CathG a string such as that would have to be split first, using either strsplit or scan.
yes, very likely, I guess the final goal is to get the numbers only, and one by one, to compute something, so there actually won't be a problem in the end
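
The behaviour discussed in these comments can be sketched in base R. The helper name strip_to_numeric is hypothetical; its body is simply the expression quoted in the first comment, so this is an illustration of the point being made rather than tidyr's actual code. Splitting on whitespace with strsplit is one of the two approaches suggested above.

```r
# Hypothetical helper built from the expression quoted in the comments:
# keep only digits, points and hyphens, then convert to numeric.
strip_to_numeric <- function(x) {
  as.numeric(gsub("[^0-9.-]+", "", as.character(x)))
}

x <- "2,123.02 2,565"

# Applied to the whole string, the two numbers run together.
merged <- strip_to_numeric(x)

# Splitting on whitespace first keeps them separate.
pieces <- strsplit(x, "[[:space:]]+")[[1]]
separate <- strip_to_numeric(pieces)

print(merged)    # a single value, 2123.022565
print(separate)  # two values, 2123.02 and 2565
```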
