How to read specific lines from text using a starting and ending condition?

Question

I have a document.gca file that contains information I need to extract. One part of the text repeats the following structure:

#Sta/Elev= xx
(here goes pair numbers)
#Mann

This block repeats several times. My goal is to capture the number pairs inside each block, for every occurrence in the file. How can I extract them? Say I have this:

Sta/Elev= 259 
   0 2186.31      .3 2186.14      .9 2185.83     1.4 2185.56     2.5 2185.23
   3 2185.04     3.6 2184.83     4.7 2184.61     5.6  2184.4     6.4 2184.17
 6.9 2183.95     7.5 2183.69     7.6 2183.59       8 2183.35     8.6 2182.92
10.2 2181.47    10.8 2181.03    11.3 2180.63    11.9 2180.27    12.4 2179.97
  13 2179.72    13.6 2179.47    14.1  2179.3    14.3 2179.21    14.7 2179.11
15.7  2178.9    17.4 2178.74    17.9 2178.65    20.1 2178.17    20.4 2178.13
20.4 2178.12    21.5 2177.94    22.6 2177.81    22.6  2177.8    22.9 2177.79
24.1 2177.78    24.4 2177.75    24.6 2177.72    24.8 2177.68    25.2 2177.54
    Mann= 3 , 0 , 0 
           0      .2       0    26.9      .2       0    46.1      .2       0
    Bank Sta=26.9,46.1
    XS Rating Curve= 0 ,0
    XS HTab Starting El and Incr=2176.01,0.3, 56 
    XS HTab Horizontal Distribution= 0 , 0 , 0 
    Exp/Cntr(USF)=0,0
    Exp/Cntr=0.3,0.1

    Type RM Length L Ch R = 1 ,2655    ,11.2,11.1,10.5
    XS GIS Cut Line=4
    858341.2470677761196439.12427935858354.9998313071196457.53292637
    858369.2753539641196470.40256485858387.8228168661196497.81690065
    Node Last Edited Time=Aug/05/2019 11:42:02
    Sta/Elev= 245 
     0 2191.01      .8 2190.54     2.5  2189.4       5 2187.76     7.2  2186.4
     8.2 2185.73     9.5 2184.74    10.1 2184.22    10.3 2184.04    10.8 2183.55
    12.8 2180.84    13.1 2180.55    13.3 2180.29    13.9 2179.56    14.2 2179.25
    14.5 2179.03    15.8 2178.18    16.4 2177.81    16.7 2177.65      17 2177.54
    17.1 2177.51    17.2 2177.48    17.5 2177.43    17.6  2177.4    17.8 2177.39
    18.3 2177.37    18.8 2177.37    19.7 2177.44      20 2177.45    20.6 2177.45
    20.7 2177.45    20.8 2177.44      21 2177.42    21.3 2177.41    21.4  2177.4
    21.7 2177.32      22 2177.26    22.1 2177.21    22.2 2177.13    22.5 2176.94
    22.6 2176.79    22.9 2176.54    23.2 2176.19    23.5 2175.88    23.9 2175.68
    24.4 2175.55    24.6 2175.54    24.8 2175.53    24.9 2175.53    25.1 2175.54
    25.7 2175.63      26 2175.71    26.3 2175.78    26.4  2175.8    26.4 2175.82
#Mann= 3 , 0 , 0 
       0      .2       0    22.9      .2       0      43      .2       0
Bank Sta=22.9,43
XS Rating Curve= 0 ,0
XS HTab Starting El and Incr=2175.68,0.3, 51 
XS HTab Horizontal Distribution= 0 , 0 , 0 
Exp/Cntr(USF)=0,0
Exp/Cntr=0.3,0.1

I want to select the numbers between Sta/Elev and Mann and save them as a pair of vectors for each Sta/Elev block. Right now I have this:

import re

with open('a.g01','r') as file:
    file_contents = file.read()
    #print(file_contents)

try:
    found = re.search('#Sta/Elev(.+?)#Mann',file_contents).group(1)
except AttributeError:
    found = '' # apply your error handling

print(found)

found is empty, and I want to capture all the numbers between '#Sta/Elev' and '#Mann'.

No, not exactly. It changes depending on the geometry and other specific parameters. This is a HEC-RAS result, but I must extract the values between Sta/Elev and Mann, and this pattern repeats several times because there are many elevations (I think that is the correct way to say it), so I need to extract the numbers in that interval.
– Julian Andres Lastra Garcia, Commented Sep 12, 2019 at 13:52
For example, I have this: \n Sta/Elev=120 \n 0 2191.01 .8 2190.54 2.5 2189.4 5 2187.76 7.2 2186.4 \n #Mann \n says other thing, again. \n Sta/Elev=121 \n 8.2 2185.73 9.5 2184.74 10.1 2184.22 10.3 2184.04 10.8 2183.55 \n #Mann \n I need to extract those numbers.
– Julian Andres Lastra Garcia, Commented Sep 12, 2019 at 13:53

Rodolfo Donã Hosp · Accepted Answer · 2019-09-12 16:09:03Z

1

The problem is in your regex. Try switching

found = re.search('#Sta/Elev(.+?)#Mann',file_contents).group(1)

to

found = re.search('Sta/Elev(.*)Mann',file_contents).group(1)

output:

>>> import re
>>> file_contents = 'Sta/ElevthisisatestMann'
>>> found = re.search('Sta/Elev(.*)Mann',file_contents).group(1)
>>> print(found)
thisisatest

Edit:

For multiline matching, try adding the re.DOTALL flag:

found = re.search('Sta/Elev=(.*)Mann',file_contents, re.DOTALL).group(1)
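
As a rough illustration of what the flag changes, the sketch below runs the pattern on a made-up two-block sample (the `text` string is hypothetical, shaped like the example in the comments). It also shows one thing to watch for: with re.DOTALL a greedy `.*` runs from the first Sta/Elev= to the last Mann, while the non-greedy `.*?` stops at the first Mann.

```python
import re

# Hypothetical two-block sample, shaped like the text described in the question.
text = (
    "Sta/Elev= 120\n"
    "0 2191.01 .8 2190.54\n"
    "#Mann= 3 , 0 , 0\n"
    "Sta/Elev= 121\n"
    "8.2 2185.73 9.5 2184.74\n"
    "#Mann= 3 , 0 , 0\n"
)

# Without re.DOTALL, "." does not match a newline, so the pattern
# cannot span from the Sta/Elev line to the Mann line.
print(re.search(r"Sta/Elev=(.*)Mann", text))  # None

# Greedy with re.DOTALL: captures everything up to the LAST "Mann",
# so the second block's header ends up inside the capture.
greedy = re.search(r"Sta/Elev=(.*)Mann", text, re.DOTALL).group(1)

# Non-greedy with re.DOTALL: stops at the FIRST "Mann".
lazy = re.search(r"Sta/Elev=(.*?)Mann", text, re.DOTALL).group(1)

print("Sta/Elev= 121" in greedy)  # True
print("Sta/Elev= 121" in lazy)    # False
```

If the file holds more than one block, the non-greedy form is probably the one to build on.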

It was not clear to me what the separating string is, since it differs between your examples, but you can simply change it in the regular expression.
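
To collect every block rather than only the first, one possible approach is re.findall with a non-greedy group, then splitting each captured body into numbers. This is a minimal sketch under a few assumptions: the names `BLOCK` and `extract_sections` are made up for illustration, the `#` in front of either marker is treated as optional because the examples differ, the text after `Sta/Elev=` on the header line is skipped, and the values are assumed to be separated by whitespace.

```python
import re

# Assumed layout: a "Sta/Elev=" header line, then data lines, then a line
# that starts (after optional spaces and an optional "#") with "Mann".
BLOCK = re.compile(
    r"#?Sta/Elev=[^\n]*\n(.*?)^[ \t]*#?Mann",
    re.DOTALL | re.MULTILINE,
)

def extract_sections(text):
    """Return one (stations, elevations) pair of lists per Sta/Elev block."""
    sections = []
    for body in BLOCK.findall(text):
        # Assumes whitespace-separated numbers: station, elevation, station, ...
        values = [float(token) for token in body.split()]
        stations = values[0::2]    # 1st, 3rd, 5th, ... value
        elevations = values[1::2]  # 2nd, 4th, 6th, ... value
        sections.append((stations, elevations))
    return sections

# Hypothetical sample in the shape given in the comments.
sample = (
    "Sta/Elev= 120\n"
    "0 2191.01 .8 2190.54\n"
    "#Mann= 3 , 0 , 0\n"
    "says other thing\n"
    "    Sta/Elev= 121\n"
    "    8.2 2185.73 9.5 2184.74\n"
    "    Mann= 3 , 0 , 0\n"
)

print(extract_sections(sample))
# [([0.0, 0.8], [2191.01, 2190.54]), ([8.2, 9.5], [2185.73, 2184.74])]
```

For the real file, the same function could be called on file_contents after reading 'a.g01' as in the question. If some values in the file run together without a space between them, the whitespace split would need to be replaced with something that knows the column widths.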

edited Sep 12, 2019 at 16:09

answered Sep 12, 2019 at 13:43

5 Comments

Julian Andres Lastra Garcia Over a year ago

I appreciate your help, but it's not working. As I said in my earlier comment, I have this text: \n Sta/Elev=120 \n 0 2191.01 .8 2190.54 2.5 2189.4 5 2187.76 7.2 2186.4 \n #Mann \n says other thing, again. \n Sta/Elev=121 \n 8.2 2185.73 9.5 2184.74 10.1 2184.22 10.3 2184.04 10.8 2183.55 \n #Mann \n I need to extract those numbers.

Rodolfo Donã Hosp Over a year ago

After the file_contents argument, add the re.DOTALL flag. Also add the = after Sta/Elev; I thought it wasn't part of the matching text.

Julian Andres Lastra Garcia Over a year ago

Excuse me, could you write the code again? I still don't understand what you did. I appreciate it.

Rodolfo Donã Hosp Over a year ago

Edited the answer

Julian Andres Lastra Garcia Over a year ago

Thanks for your answer, man, but I still don't have the specific values. I just want the values between #Sta/Elev and #Mann. Could you please check my edited question to better understand what I need? Thank you.
