I am trying to split a text where it is between \n\n and \n, in that order. Take this string for example:
\n\nMy take on fruits.\n\nHealthy Fruits\nAn apple is a fruit and it\'s very good.\n\nPears are good as well. Bananas are very good too and healthy.\n\nSour Fruits\nOranges are on the sour side and contains a lot of vitamin C.\n\nGrapefruits are even more sour, if you can believe it.
My desired output is:
[('Healthy Fruits', "An apple is a fruit and it's very good.", 'Pears are good as well. Bananas are very good too and healthy.'), ('Sour Fruits', 'Oranges are on the sour side and contains a lot of vitamin C.', 'Grapefruits are even more sour, if you can believe it.')]
I want to parse like this because anything between \n\n and \n is the title and the rest is text under the title (So "Healthy Fruits" and "Sour Fruits" . Not sure if this is the best way to grab the titles and its text.
re.split(r'\n\n?, ur_txt)re.findall('(?<!\n)\n\n(.+)\n(?!\n)((?s:.*?))(?=\n\n|\Z)', text)will do.[('Healthy Fruits', "An apple is a fruit and it's very good. Bananas are very good too and healthy."), ('Sour Fruits', 'Oranges are on the sour side and contains a lot of vitamin C.', 'Grapefruits are even more sour, if you can believe it.')]