Split a string into a string and an integer

Question

I have several strings that contain one or more digits, optionally followed by one or more letters (letter case doesn't matter). The strings match the following regex pattern:

[0-9]+[a-zA-Z]*

and may look like:

"15791"
"14810A"
"10480ABCD"
"5ABCDEFGH"

If one of the strings above contains non-numerical characters, how do I split the numbers (first part) into an integer and the letters (second part) into a string?

I know I can split a string like this:

array = "1,2,3,4".split(',')

But this doesn't help since I don't have a separator.

Good question and well-written: succinct, complete, unambiguous.
– Cary Swoveland, Commented Mar 18, 2015 at 17:22

Avinash Raj · Accepted Answer · 2015-03-18 10:53:29Z

11

Use a regex based on lookbehind and lookahead assertions with String#split.

> "10480ABCD".split(/(?<=\d)(?=[A-Za-z])/)
=> ["10480", "ABCD"]

(?<=\d) Positive lookbehind, which asserts that the match must be preceded by a digit.
(?=[A-Za-z]) Positive lookahead, which asserts that the match must be followed by a letter.

So the regex matches the zero-width boundary between a digit and a letter. Splitting the input at that boundary gives the desired output.

OR

Use string.scan

> "10480ABCD".scan(/\d+|[A-Za-z]+/)
=> ["10480", "ABCD"]
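Both calls return strings, while the question asks for the numeric part as an integer. A minimal sketch of the extra step, assuming the input already matches the pattern from the question; the helper name `number_and_letters` is made up for illustration:

```ruby
# Hypothetical helper (not part of the answer above): split at the
# digit/letter boundary, then convert the numeric part with to_i.
def number_and_letters(str)
  digits, letters = str.split(/(?<=\d)(?=[A-Za-z])/)
  # letters is nil when the string holds digits only, so fall back to "".
  [digits.to_i, letters.to_s]
end

number_and_letters("15791")     # => [15791, ""]
number_and_letters("10480ABCD") # => [10480, "ABCD"]
```

The digits-only case needs the `to_s` fallback because split then returns a single-element array.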

edited Mar 18, 2015 at 10:53

answered Mar 18, 2015 at 10:37

Avinash Raj


4 Comments

NickEckhart Over a year ago

Wow, that was prompt! Tried it in irb and works like a charm! Thanks so much, I'll accept the answer in 12 minutes :)

humza Over a year ago

Avinash, if the string will contain only digits and letters, it would be best to stick with \d and \D to keep the expression simple, like in my response. :)

Avinash Raj Over a year ago

The OP clearly mentions that the string must satisfy the [0-9]+[a-zA-Z]* pattern. Changing [A-Za-z]+ to \D+ won't make much difference.

humza Over a year ago

Since we know the string will contain only digits and letters, using \d and \D keeps things simple and readable. By analogy: 3 and 4 can be added as 3 + 4 or as 1 + 1 + 1 + 4. The results are the same, but simplicity is preferred.

sawa · Answer · 2015-03-18 11:12:53Z

9

The splitter is the non-numerical characters themselves:

"10480ABCD".split(/(\D+)/)
# => ["10480", "ABCD"]
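The parentheses matter here. A short comparison, as a sketch using only String#split from core Ruby, of the same pattern with and without the capture group (the variable names are illustrative):

```ruby
str = "10480ABCD"

# Without a capture group the letters are a plain delimiter and are dropped.
without_group = str.split(/\D+/)       # => ["10480"]

# With a capture group the matched delimiter is kept in the result.
with_group = str.split(/(\D+)/)        # => ["10480", "ABCD"]

# A digits-only string contains no delimiter, so one element comes back.
digits_only = "15791".split(/(\D+)/)   # => ["15791"]
```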

answered Mar 18, 2015 at 11:12

sawa


1 Comment

Cary Swoveland Over a year ago

Clever. Puzzled readers (if any): one line of the docs for String#split reads, "If pattern contains [capture] groups, the respective matches will be returned in the array as well."

Stefan · Answer · 2015-03-18 11:02:49Z

0

You can always use match:

re = /(\d+)([a-z]*)/i
str = "10480ABCD"

m = re.match(str)
m    #=> #<MatchData "10480ABCD" 1:"10480" 2:"ABCD">
m[1] #=> "10480"
m[2] #=> "ABCD"

Use MatchData#[] to extract capture groups:

re.match(str)[1, 2]
#=> ["10480", "ABCD"]
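The match approach can also validate the input if the pattern is anchored. A possible sketch, assuming strings that don't fit the question's pattern should be rejected; `CODE_PATTERN` and `parse_code` are hypothetical names, not part of the answer:

```ruby
# Hypothetical pattern with named groups. \A and \z anchor the match to the
# whole string, so inputs like "ABC123" or "12-AB" do not match.
CODE_PATTERN = /\A(?<number>\d+)(?<letters>[a-z]*)\z/i

# Returns [Integer, String], or nil when the string does not fit the pattern.
def parse_code(str)
  m = CODE_PATTERN.match(str)
  return nil unless m
  [m[:number].to_i, m[:letters]]
end

parse_code("14810A")  # => [14810, "A"]
parse_code("15791")   # => [15791, ""]
parse_code("ABC123")  # => nil
```

Named groups keep the call site readable, at the cost of a slightly longer pattern.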

answered Mar 18, 2015 at 11:02

Stefan


Cary Swoveland · Answer · 2015-03-18 17:31:57Z

0

[Edit: for some reason @Humza deleted his answer, so I've undeleted mine. I had previously posted this, but then deleted it when I noticed that Humza had already posted a similar answer.]

I feel like I must be missing something, as it seems to have a straightforward solution:

def extract(str)
  str.scan(/\d+|[A-Z]+/i)
end

extract "15791"     #=> ["15791"] 
extract "14810A"    #=> ["14810", "A"] 
extract "10480ABCD" #=> ["10480", "ABCD"]
extract "5ABCDEFGH" #=> ["5", "ABCDEFGH"]

edited Mar 18, 2015 at 17:31

answered Mar 18, 2015 at 16:56

Cary Swoveland
