10

I have the following string

str="HelloWorld How areYou I AmFine"

I want to split this string into the following array

["Hello","World How are","You I Am", "Fine"]

I have been using the following regex. It splits at the right places, but it also drops the characters matched by the pattern, and I want to keep them. What I get is

str.split(/[a-z][A-Z]/)
 => ["Hell", "orld How ar", "ou I A", "ine"] 

It omits the matched characters.

Can anyone help me retain these characters in the resulting array?

3 Answers

7

In Ruby 1.9 you can use positive lookahead and positive lookbehind (these regex constructs are also called zero-width assertions). They check the characters around a position without consuming them, so the split happens between the lowercase and the uppercase letter and you won't lose your border characters:

str.split /(?<=[a-z])(?=[A-Z])/
=> ["Hello", "World How are", "You I Am", "Fine"] 
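To make "zero-width" concrete, here is a small sketch (Ruby 1.9 or later, since it uses lookbehind). The constant names STR and BORDER are mine, chosen only for illustration. It shows that the assertion matches a position rather than any characters, which is why nothing is dropped by split:

```ruby
STR = "HelloWorld How areYou I AmFine"
BORDER = /(?<=[a-z])(?=[A-Z])/  # lowercase letter behind, uppercase letter ahead

m = STR.match(BORDER)
p m.begin(0)  # => 5, the position between "o" and "W"
p m[0]        # => "", the match itself contains no characters

# A consuming pattern eats the two border characters...
p STR.split(/[a-z][A-Z]/)  # => ["Hell", "orld How ar", "ou I A", "ine"]
# ...while the zero-width pattern leaves them in place.
p STR.split(BORDER)        # => ["Hello", "World How are", "You I Am", "Fine"]
```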

Ruby 1.8 does not support lookbehind (it does support lookahead), so this regex will not compile there. I recommend using Ruby 1.9 if possible.

If you are forced to use Ruby 1.8.7, I think regex won't help you, and the best solution I can think of is to build a simple state machine: iterate over each character of the original string and build the first string until you hit the border condition, then start building the second string, and so on.
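A minimal sketch of that character-by-character approach might look like the following. The method name split_on_case_border is hypothetical, and I am assuming the border condition is "an ASCII lowercase letter immediately followed by an ASCII uppercase letter". It sticks to String#each_char and simple character-class matches, with no lookbehind:

```ruby
# Splits str wherever a lowercase ASCII letter is immediately
# followed by an uppercase ASCII letter, keeping both letters.
def split_on_case_border(str)
  parts = []
  current = String.new
  prev = nil

  str.each_char do |ch|
    # Border condition: previous char is lowercase, this one is uppercase.
    if prev && prev =~ /[a-z]/ && ch =~ /[A-Z]/
      parts << current
      current = String.new
    end
    current << ch
    prev = ch
  end

  parts << current unless current.empty?
  parts
end

p split_on_case_border("HelloWorld How areYou I AmFine")
# => ["Hello", "World How are", "You I Am", "Fine"]
```

An empty input returns an empty array, and a string with no border comes back as a single-element array.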


4 Comments

Thanks for the answer, but it gives me the following error: ruby-1.8.7-p302 > str="HelloWorld How areYou I AmFine" => "HelloWorld How areYou I AmFine" ruby-1.8.7-p302 > str.split /(?<=[a-z])(?=[A-Z])/ SyntaxError: compile error (irb):987: undefined (?...) sequence: /(?<=[a-z])(?=[A-Z])/ from (irb):987 from :0
So what would be the solution in 1.8? I have to use 1.8.7
I found the answer from one of my colleagues. For 1.8.7, do the following: str.underscore.split(/_/).each do |s| s.capitalize! end
@alex-kliuchnikau you can do this in 1.8 using #scan instead of #split. Then you don't need the lookbehind.
5

Three answers so far, each with a limitation: one is Rails-only and breaks if the original string contains an underscore, another is Ruby 1.9 only, and the third always has a potential error with its special character. I really liked the split on zero-width assertions from @Alex Kliuchnikau, but the OP needs Ruby 1.8, which doesn't support lookbehind. There's an answer that uses only zero-width lookahead and works fine in 1.8 and 1.9, using String#scan instead of #split.

str.scan /.*?[a-z](?=[A-Z]|$)/
=> ["Hello", "World How are", "You I Am", "Fine"]
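One caveat worth checking before relying on this: every piece returned by the scan has to end in a lowercase letter, so a tail that ends in anything else is silently dropped. The sketch below shows that, along with a possible variant that falls back to "the rest of the line" when no further border exists. The variant and both constant names are my own assumption, not part of the answer above. It still uses only lookahead, but I have not run it on 1.8 myself:

```ruby
BORDER_SCAN = /.*?[a-z](?=[A-Z]|$)/            # the pattern from the answer
BORDER_SCAN_KEEP_TAIL = /.*?[a-z](?=[A-Z])|.+/ # hypothetical variant

str = "HelloWorld How areYou I AmFine"
p str.scan(BORDER_SCAN)            # => ["Hello", "World How are", "You I Am", "Fine"]
p str.scan(BORDER_SCAN_KEEP_TAIL)  # => ["Hello", "World How are", "You I Am", "Fine"]

# A tail with no lowercase letter before end-of-line is lost by the original...
p "HelloWORLD".scan(BORDER_SCAN)            # => ["Hello"]
# ...but kept by the variant's ".+" fallback.
p "HelloWORLD".scan(BORDER_SCAN_KEEP_TAIL)  # => ["Hello", "WORLD"]
```

If your input always ends in a lowercase letter, as in the question, the original pattern is enough.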

1 Comment

+1 for the scan lookahead -- your solution is safer, faster, shorter, and better than mine. :)
-1

I think this will do the job for you

str.underscore.split(/_/).each do |s|
  s.capitalize!
end

5 Comments

Note for future visitors: this works in Rails but not in pure Ruby code, because underscore is a Rails-specific method.
Note for the future: this works only if there aren't any underscores in the original text.
Note for the future: in my case I am using Rails, and the strings I am manipulating do not contain '_', but the concerns raised by @joelparkerhenderson and @Alex are absolutely valid and must be considered before using the underscore function. Thanks again, @joel and @Alex.
@nadeem-yasin this answer you've accepted is not a nice one: it suffers from the special character bug and requires the Rails library.
I agree with dbenhur; his scan lookahead is shorter, faster, and safer than this solution and than my proposed solutions. IMHO you should change to accept dbenhur's.
