I am reading a text file into Python. The text file contains a (large) number of variables, with the variable name given as a string on the left and the value on the right, separated by an equals sign (=). For example:

Proc_Method = 2
Obs_Method = 4

As long as the value of a variable is given on a single line, I can read out the value correctly with:

    namevalue = data.split('=') 
    name = namevalue[0].strip()
    value = namevalue[1].strip()

However, if the variable is spread over multiple lines (i.e. an array), this code only assigns the FIRST row of the array to the variable before moving on to the next variable. So if I had a variable of the following form:

Corr_Mat = 10 0 0 
           20 10 0
           0 20 10  

the above code would set value to 10 0 0 and then move on to the next variable. Is there a way to define value so that it takes ALL the lines starting at the equals sign and ending at the line before the next equals sign in the text file?

4 Comments
  • Are the values always integers? Commented Jul 8, 2022 at 21:58
  • Why don't you check if the line contains = if not consider the line as the value of the previous key? Commented Jul 8, 2022 at 21:59
  • Ah no the values don't have to be integers, it was just the quickest way to describe the issue. They don't even have to be numbers, some of the variables are strings of path names leading to other text files. But these are all on one line of the text file and I am able to assign them to "value" just fine. It is only the arrays that are giving me trouble. Commented Jul 8, 2022 at 22:00
  • @PraveenPremaratne ah something like an if statement? Is there a natural way to say if (line doesn't contain '=') then value = line+line? Commented Jul 8, 2022 at 22:08

3 Answers

1

With a file like this:

Proc_Method = 2
Obs_Method = 4
Corr_Mat = 10 0 0 
           20 10 0
           0 20 10
Proc_Method = 2

Option 1

You can keep track of the most recent variable: each line containing = adds a new key to the dictionary variables, and every following line without = is appended to that key's value.

variables = {}

with open('file.txt') as f:
    for line in f:
        if '=' in line:
            # add a new key-value pair to the dictionary,
            # remember the current variable name;
            # split on the first '=' only so a value containing '=' does not raise
            current_variable, value = line.split('=', 1)
            variables[current_variable] = value
        else:
            # it's a continued line -> add to the remembered key
            variables[current_variable] += line

Result:

{'Proc_Method ': ' 2\n',
 'Obs_Method ': ' 4\n',
 'Corr_Mat ': ' 10 0 0 \n           20 10 0\n           0 20 10\n'}
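
The keys and values in this result still carry their surrounding whitespace and newlines. Below is a minimal sketch of one possible clean-up step, assuming the dictionary has the shape shown in the result above. The helper name `clean_variables` is hypothetical and not part of the answer.

```python
def clean_variables(raw):
    """Strip names and split each value into stripped rows (hypothetical helper)."""
    cleaned = {}
    for name, value in raw.items():
        rows = [row.strip() for row in value.splitlines() if row.strip()]
        # Single-line values stay plain strings; multi-line values become a list of rows.
        cleaned[name.strip()] = rows[0] if len(rows) == 1 else rows
    return cleaned


# Sample input copied from the result shown above.
variables = {
    'Proc_Method ': ' 2\n',
    'Obs_Method ': ' 4\n',
    'Corr_Mat ': ' 10 0 0 \n           20 10 0\n           0 20 10\n',
}
cleaned = clean_variables(variables)
print(cleaned)
```

Keeping single-line values as strings and multi-line values as lists is only one design choice; returning a list in every case would make downstream code more uniform.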

Option 2

Alternatively, you can read the file as a whole and use a regular expression to extract the variables.

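The pattern searches for the start of a line (leading ^ in combination with re.MULTILINE), followed by one or more characters (.+), followed by =, followed by one or more characters that are not equals signs ([^=]+), followed by a newline. Because [^=] also matches newline characters, this part can run across several lines, which is what captures the continuation rows of an array.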

import re 

with open('file.txt') as f:
    file_content = f.read()

chunks = re.findall(r'^.+=[^=]+\n', file_content, flags=re.MULTILINE)

Result:

['Proc_Method = 2\n',
 'Obs_Method = 4\n',
 'Corr_Mat = 10 0 0 \n           20 10 0\n           0 20 10\n',
 'Proc_Method = 2\n']

Of course, that needs some cleaning, and it only works if the variables don't contain any = signs, which might not be guaranteed.
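
As a rough sketch of the cleaning step, the chunks could be turned into a dictionary as follows. This assumes the file content ends with a newline, and it uses an inline sample string in place of `f.read()`. The name `parsed` is only illustrative.

```python
import re

# Sample content standing in for f.read(); note the trailing newline the pattern relies on.
file_content = (
    "Proc_Method = 2\n"
    "Obs_Method = 4\n"
    "Corr_Mat = 10 0 0 \n"
    "           20 10 0\n"
    "           0 20 10\n"
)

chunks = re.findall(r'^.+=[^=]+\n', file_content, flags=re.MULTILINE)

parsed = {}
for chunk in chunks:
    # partition splits on the first '=' and leaves the rest of the chunk as the value.
    name, _, value = chunk.partition('=')
    # Store every value as a list of stripped rows, one entry per line.
    parsed[name.strip()] = [row.strip() for row in value.splitlines()]

print(parsed)
```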

4 Comments

Your regex method stops parsing as soon as it hits a multiline variable. Tested it with python 3.8.
@anishtain4 I don't see why that would happen, other than if the final variable has no trailing newline. What data did you test it on? Granted the re approach is probably not robust enough for real data.
I just removed Proc_Method = 2\n from the first line of your data and ran it (no empty lines). Running with your example doesn't capture the last line of your results either.
@anishtain4 I am assuming a \n at the end.
0

Try this code.

params = {}

last_param = ''

for line in data.splitlines():
    
    line = line.strip()
    
    if '=' in line:
        sp = line.split('=', 1)  # split on the first '=' only, so values containing '=' stay intact
        last_param = sp[0]
        params[last_param] = sp[1]
    else:
        params[last_param] += f'\n{line}'

print(params)

Result:

{'Proc_Method ': ' 2', 'Obs_Method ': ' 4', 'Corr_Mat ': ' 10 0 0\n20 10 0\n0 20 10'}
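
If the array values are needed as rows rather than as one string, a follow-up step along these lines might help. It is a sketch under the assumption that tokens are separated by whitespace, and the helper `parse_value` is hypothetical. Tokens that are not integers (for example path names) are left as strings.

```python
def parse_value(text):
    """Split a (possibly multi-line) value into rows of tokens (hypothetical helper)."""
    def convert(token):
        try:
            return int(token)
        except ValueError:
            return token  # leave non-integer tokens as strings

    return [
        [convert(token) for token in row.split()]
        for row in text.splitlines()
        if row.strip()
    ]


# Sample input copied from the result shown above.
params = {'Proc_Method ': ' 2', 'Obs_Method ': ' 4', 'Corr_Mat ': ' 10 0 0\n20 10 0\n0 20 10'}
parsed = {name.strip(): parse_value(value) for name, value in params.items()}
print(parsed)
```

Note that a path containing spaces would be split into several tokens by this sketch, so such values would need separate handling.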

-1

I got a cleaner result with this but arguably messier code. But I guess there are many easy ways to make this code cleaner...

File Input:

Proc_Method = 2
Obs_Method = 4
Corr_Mat = 10 0 0 
           20 10 0
           0 20 10
test = 4
test2 = 4
woof = a b c 
           s sdfasd sdas 
           sda as a 

Code:

output = {}
name = None  # name of the variable currently being filled
# Raw string so the backslashes in the Windows path are not treated as escapes.
with open(r"E:\Coding\stackoverflow\input.txt", "r") as file:
    for line in file:
        line_content = line.strip()
        if not line_content:
            continue  # skip blank lines
        if "=" in line_content:
            # New variable: split on the first '=' only.
            name, values = line_content.split("=", 1)
            name = name.strip()
            output[name] = values.split()
        elif name is not None:
            # Continuation line: append its tokens to the current variable.
            output[name].extend(line_content.split())

print(output)

Result

{
   "Proc_Method":[
      "2"
   ],
   "Obs_Method":[
      "4"
   ],
   "Corr_Mat":[
      "10",
      "0",
      "0",
      "20",
      "10",
      "0",
      "0",
      "20",
      "10"
   ],
   "test":[
      "4"
   ],
   "test2":[
      "4"
   ],
   "woof":[
      "a",
      "b",
      "c",
      "s",
      "sdfasd",
      "sdas",
      "sda",
      "as",
      "a"
   ]
}

From this output you can use the data however you want, knowing that all the values under any key in the dictionary output are in a single flat list.
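
Because the rows of an array are flattened into one list here, the row structure has to be rebuilt if it matters. A minimal sketch, assuming the row width is known in advance (3 for Corr_Mat in the example input). The helper `reshape` is hypothetical.

```python
def reshape(flat, width):
    """Cut a flat list into consecutive rows of `width` items (hypothetical helper)."""
    if width <= 0 or len(flat) % width != 0:
        raise ValueError("list length must be a positive multiple of the row width")
    return [flat[i:i + width] for i in range(0, len(flat), width)]


# Flat list copied from the Corr_Mat entry in the result above.
corr_mat = ["10", "0", "0", "20", "10", "0", "0", "20", "10"]
rows = reshape(corr_mat, 3)  # the width of 3 is an assumption about the data
print(rows)
```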
