I am new to regular expressions in Ruby. I have read some tutorials and put together a piece of code. Please let me know if there is a better way to do this.

Here is my text, which needs to be split at {{iwsection(*)}} and {{usersection}}:

    t='{{iwsection(1)}}
    This has some sample text 1 - line 1
    This has some sample text 1 - line 2
    {{iwsection(2)}}
    This has some sample text 2
    {{iwsection(3)}}
    This has some sample text 3
    {{usersection}}
    This is a user section.
    This has some sample text
    This has some sample text'

Here is the Ruby regex I have managed so far:

    t.split(/^({{[i|u][wsection]\w*...}})/)

Thank You.

Desired output: an array like this,

    [ '{{iwsection(1)}}', 'This has some sample text 1 - line 1\nThis has some sample text 1 - line 2',
    '{{iwsection(2)}}', 'This has some sample text 2',
    '{{iwsection(3)}}', 'This has some sample text 3',
    '{{usersection}}', 'This is a user section.\nThis has some sample text\nThis has some sample text']

With this I will build a Hash,

    { 
    '{{iwsection(1)}}' => 'This has some sample text 1 - line 1\nThis has some sample text 1 - line 2',
    '{{iwsection(2)}}' => 'This has some sample text 2',
    '{{iwsection(3)}}' => 'This has some sample text 3',
    '{{usersection}}' => 'This is a user section.\nThis has some sample text\nThis has some sample text'
    }

Edit: .....

The code.

    section_array = text.chomp.split(/\r\n|\n/).inject([]) do |a, v|
      if v =~ /{{.*}}/
        a << [v.gsub(/^{{|}}$/, ""), []]
      else
        a.last[1] << v
      end
      a
    end.select { |k, v| k.start_with?("iwsection") || k.start_with?("usersection") }
       .map { |k, v| ["{{#{k}}}", v.join("\n")] }
  • What's your desired output array? Please post an example of what you would want the results to look like. Commented Aug 17, 2014 at 20:05
  • You shouldn't have the Rails tag here, as this is a pure-Ruby question. Having a superfluous tag may cause some to waste time, others (who filter out Rails questions) to not see the question. Commented Aug 17, 2014 at 20:22
  • @CarySwoveland Thanks. I somehow missed this. Commented Aug 18, 2014 at 6:35
  • @CodyCaughlan I have updated the question with the desired output. Commented Aug 18, 2014 at 6:35
  • Depending on what you are actually trying to do, it looks like either a config parser (e.g. parseconfig) or a templating solution (e.g. Mustache) could possibly solve your problem in a cleaner way. Commented Aug 18, 2014 at 13:44

2 Answers

Using String#scan:

> t.scan(/{{([^}]*)}}\r?\n(.*?)\r?(?=\n{{|\n?$)/)
=> [["iwsection(1)", "This has some sample text 1"], ["iwsection(2)", "This has some sample text 2"], ["iwsection(3)", "This has some sample text 3"], ["usersection", "This is a user section."]]

> h = t.scan(/{{([^}]*)}}\r?\n(.*?)\r?(?=\n{{|\n?$)/).to_h
=> {"iwsection(1)"=>"This has some sample text 1", "iwsection(2)"=>"This has some sample text 2", "iwsection(3)"=>"This has some sample text 3", "usersection"=>"This is a user section."}

> h.values
=> ["This has some sample text 1", "This has some sample text 2", "This has some sample text 3", "This is a user section."]

> h.keys
=> ["iwsection(1)", "iwsection(2)", "iwsection(3)", "usersection"]

> h["usersection"]
=> "This is a user section."

Update:

#!/usr/bin/env ruby
t = "{{iwsection(1)}}\nThis has some sample text 1 - line 1\nThis has some sample text 1 - line 2\n{{iwsection(2)}}\nThis has some sample text 2\n{{iwsection(3)}}\nThis has some sample text 3\nThis has some sample text\nThis has some sample text\n{{usersection}}\nThis is a user section.\nThis has some sample text\nThis has some sample text"
h = t.chomp.split(/\n/).inject([]) do |a, v|
  if v =~ /{{.*}}/
    a << [v.gsub(/^{{|}}$/, ""), []]
  else
    a.last[1] << v
  end
  a
end.select{ |k, v| k.start_with? "iwsection" or k === "usersection" }.map{ |k, v| [k, v.join("\n")] }.to_h
puts h.inspect

Output:

{"iwsection(1)"=>"This has some sample text 1 - line 1\nThis has some sample text 1 - line 2", "iwsection(2)"=>"This has some sample text 2", "iwsection(3)"=>"This has some sample text 3\nThis has some sample text\nThis has some sample text", "usersection"=>"This is a user section.\nThis has some sample text\nThis has some sample text"}
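
As a possible alternative to the inject-based grouping, here is a minimal sketch using Enumerable#slice_before, which starts a new chunk at every line that matches a marker. The sample text, the `marker` pattern, and the variable names are assumptions made for illustration, and the heredoc syntax needs Ruby 2.3 or later.

```ruby
# Sample input modelled on the question (shortened; an assumption for this sketch).
text = <<~TEXT
  {{iwsection(1)}}
  Text 1 - line 1
  Text 1 - line 2
  {{iwsection(2)}}
  Text 2
  {{usersection}}
  User section.
  Text
TEXT

# Hypothetical pattern: a line that consists solely of a wanted marker.
marker = /\A\{\{(iwsection\(\d+\)|usersection)\}\}\z/

sections = text.lines.map(&:chomp)
               .slice_before { |line| line =~ marker }   # start a new chunk at each marker line
               .select { |chunk| chunk.first =~ marker } # skip any text before the first marker
               .map { |header, *body| [header[marker, 1], body.join("\n")] }
               .to_h

p sections
```

Unlike the inject version, this does not raise if some text appears before the first marker; that leading chunk is simply filtered out.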

8 Comments

Wow..!! I did not know about this method in String class. I will give it a try. This actually fits my desired output.
It looks like this scans any text between {{ and }} and builds an array/hash from it. Can we limit it to only the sections named "iwsection" and "usersection"?
@rupeshj Rather than making the regex dirty, just select() the needed text instead: t.scan(/{{([^}]*)}}\r?\n(.*?)\r?(?=\n{{|\n?$)/).select{ |k, v| k.start_with?("iwsection") || k == "usersection" }
Sure, Thanks. I will use select.
This one captures only one line, I am not getting the result when I have multiple lines in the section. :(

You can do that like this:

t.split(/{{iwsection\(\d+\)}}|{{usersection}}/)
  #=> ["", "\n    This has some sample text 1\n    ",
  #    "\n    This has some sample text 2\n    ",
  #    "\n    This has some sample text 3\n    ",
  #    "\n    This is a user section."]

That's what you asked for, but if you want to clean that up, add .map(&:strip):

t.split(/{{iwsection\(\d+\)}}|{{usersection}}/).map(&:strip)
  #=> ["", "This has some sample text 1", "This has some sample text 2",
  #    "This has some sample text 3", "This is a user section."]

You may not want the empty string at offset zero, but that's how String#split works when you are splitting on a substring that is at the beginning of the string. Suppose the string were instead:

t =
'Some text here{{iwsection(1)}}
This has some sample text 1
{{iwsection(2)}}
This has some sample text 2'

t.split(/{{iwsection\(\d+\)}}|{{usersection}}/).map(&:strip)
  #=> ["Some text here", "This has some sample text 1",
  #    "This has some sample text 2"]

Here you want "Some text here", so you can't just delete the first element of the array.
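
If the markers themselves are also needed, one option is to wrap the whole pattern in a capturing group, since String#split includes captured groups in the array it returns. The following is only a sketch; the sample string and the escaped-brace form of the pattern are my own choices for illustration.

```ruby
t = "Some text here{{iwsection(1)}}\nThis has some sample text 1\n" \
    "{{iwsection(2)}}\nThis has some sample text 2"

# The outer parentheses form a capturing group, so each matched marker
# is returned as its own element between the surrounding pieces of text.
parts = t.split(/(\{\{iwsection\(\d+\)\}\}|\{\{usersection\}\})/).map(&:strip)

p parts
```

The text before the first marker is still the first element, so it is kept rather than lost.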

Additional requirements

To satisfy your added requirement, you could do this:

t='{{iwsection(1)}}
Text 1 - line 1
Text 1 - line 2
{{iwsection(2)}}
Text 2
{{iwsection(3)}}
Text 3
{{usersection}}
User section.
Text
Text' 

h = t.scan(/(?:{{iwsection\(\d+\)}}|{{usersection}})/)
     .zip(t.split(/{{iwsection\(\d+\)}}|{{usersection}}/)[1..-1])
     .map { |s1,s2| [s1, s2.strip
                           .lines
                           .map(&:strip)
                           .join("\n")] }
     .to_h
  #=> {"{{iwsection(1)}}"=>"Text 1 - line 1\nText 1 - line 2",
  #    "{{iwsection(2)}}"=>"Text 2",
  #    "{{iwsection(3)}}"=>"Text 3",
  #    "{{usersection}}"=>"User section.\nText\nText"}

Note that IRB and Pry may not understand this formatting (a method chain continued on lines that begin with a dot), but it works fine when run as a script from the command line.

Explanation

a = t.scan(/(?:{{iwsection\(\d+\)}}|{{usersection}})/)
  #=> ["{{iwsection(1)}}", "{{iwsection(2)}}", "{{iwsection(3)}}", "{{usersection}}"]
b = t.split(/{{iwsection\(\d+\)}}|{{usersection}}/)
  #=> ["", "\n    Text 1 - line 1\n    Text 1 - line 2\n    ",
  #    "\n    Text 2\n    ", "\n    Text 3\n    ",
  #    "\n    User section.\n    Text\n    Text"]
c = b[1..-1]
  #=> ["\n    Text 1 - line 1\n    Text 1 - line 2\n    ",
  #    "\n    Text 2\n    ", "\n    Text 3\n    ",
  #    "\n    User section.\n    Text\n    Text"]
h = a.zip(c)
  #=> [["{{iwsection(1)}}", "\n    Text 1 - line 1\n    Text 1 - line 2\n    "],
  #    ["{{iwsection(2)}}", "\n    Text 2\n    "],
  #    ["{{iwsection(3)}}", "\n    Text 3\n    "],
  #    ["{{usersection}}", "\n    User section.\n    Text\n    Text"]]
d = h.map { |s1,s2| [s1, s2.strip
                           .lines
                           .map(&:strip)
                           .join("\n")] }
  #=> [["{{iwsection(1)}}", "Text 1 - line 1\nText 1 - line 2"],
  #    ["{{iwsection(2)}}", "Text 2"], ["{{iwsection(3)}}", "Text 3"],
  #    ["{{usersection}}", "User section.\nText\nText"]]
d.to_h
  #=> {"{{iwsection(1)}}"=>"Text 1 - line 1\nText 1 - line 2",
  #    "{{iwsection(2)}}"=>"Text 2",
  #    "{{iwsection(3)}}"=>"Text 3",
  #    "{{usersection}}"=>"User section.\nText\nText"} 
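
The scan and split steps above could possibly be folded into a single scan that captures each marker together with the text up to the next marker. This is a sketch under assumptions: the names `marker` and `pattern` and the shortened sample string are mine, the /m flag is used so that `.` also matches newlines, and repeated markers would overwrite each other in the resulting hash.

```ruby
t = "{{iwsection(1)}}\n  Text 1 - line 1\n  Text 1 - line 2\n" \
    "{{iwsection(2)}}\nText 2\n{{usersection}}\nUser section.\nText"

# Hypothetical helper pattern matching either kind of marker.
marker = /\{\{(?:iwsection\(\d+\)|usersection)\}\}/

# Group 1: the marker. Group 2: everything, lazily, up to the next
# marker or the end of the string (checked by the lookahead).
pattern = /(#{marker})\s*(.*?)\s*(?=#{marker}|\z)/m

h = t.scan(pattern)
     .map { |key, body| [key, body.lines.map(&:strip).join("\n")] } # strip per-line indentation
     .to_h

p h
```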

1 Comment

How can I retain the iwsection/usersection marker, or at least the number inside iwsection, when I split it?
