I want to use the split string method to extract information from each line into a list.
4 Answers
Use splitlines, it's better:
with open('scores.txt', 'r') as f:
    lines = f.read().splitlines()
exam_one = []
for line in lines:
    fields = line.split()  # not strip
    exam_one.append(int(fields[2]))  # or use float() if a score can be fractional
print(exam_one)  # => [100, 82, 94, 89, 87]
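One reason splitlines() is convenient here, shown as a minimal sketch: a text file normally ends with a newline, and split('\n') turns that into a trailing empty string, while splitlines() does not. The string below is a made-up stand-in for the file contents, and the variable names are illustrative.

```python
# Hypothetical stand-in for open('scores.txt').read()
text = "Hopper, Grace 100 98 87 97\nKnuth, Donald 82 87 92 81\n"

with_split = text.split("\n")        # keeps an empty string after the final newline
with_splitlines = text.splitlines()  # no trailing empty string

print(with_split)
print(with_splitlines)
```

With the split('\n') version, the empty last element would make line.split()[2] raise an IndexError inside the loop.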
Suppose you have the following string that has words (separated by horizontal whitespace) and lines (separated by \n or vertical whitespace):
>>> print(data)
Hopper, Grace 100 98 87 97
Knuth, Donald 82 87 92 81
Goldberg, Adele 94 96 90 91
Kernighan, Brian 89 74 89 77
Liskov, Barbara 87 97 81 85
If you just use .split(), you lose all distinction between lines and words:
>>> data.split()
['Hopper,', 'Grace', '100', '98', '87', '97', 'Knuth,', 'Donald', '82', '87', '92', '81', 'Goldberg,', 'Adele', '94', '96', '90', '91', 'Kernighan,', 'Brian', '89', '74', '89', '77', 'Liskov,', 'Barbara', '87', '97', '81', '85']
To maintain the difference, you need to combine .splitlines() with .split():
>>> [line.split() for line in data.splitlines()]
[['Hopper,', 'Grace', '100', '98', '87', '97'], ['Knuth,', 'Donald', '82', '87', '92', '81'], ['Goldberg,', 'Adele', '94', '96', '90', '91'], ['Kernighan,', 'Brian', '89', '74', '89', '77'], ['Liskov,', 'Barbara', '87', '97', '81', '85']]
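Building on the nested-list result, here is a rough sketch of turning each line into a name plus a list of integer scores. The function name parse_scores and the assumption that the first two fields are always "Last," and "First" are mine, not something the question states.

```python
data = """Hopper, Grace 100 98 87 97
Knuth, Donald 82 87 92 81"""

def parse_scores(text):
    """Return a list of (name, scores) tuples, one per non-empty line."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        # Assumption: the first two fields are "Last," and "First"
        last, first = fields[0].rstrip(","), fields[1]
        scores = [int(s) for s in fields[2:]]
        records.append((f"{first} {last}", scores))
    return records

records = parse_scores(data)
print(records)
# [('Grace Hopper', [100, 98, 87, 97]), ('Donald Knuth', [82, 87, 92, 81])]
```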
The same concept applies to data read from files. Instead of using .splitlines() you can iterate over the individual lines of the file with a for loop:
>>> with open('/tmp/file.txt') as f:
...     for line in f:
...         print(line.split())
...
['Hopper,', 'Grace', '100', '98', '87', '97']
['Knuth,', 'Donald', '82', '87', '92', '81']
['Goldberg,', 'Adele', '94', '96', '90', '91']
['Kernighan,', 'Brian', '89', '74', '89', '77']
['Liskov,', 'Barbara', '87', '97', '81', '85']
Or, if you want nested lists:
>>> with open('/tmp/file.txt') as f:
...     print([line.split() for line in f])
...
[['Hopper,', 'Grace', '100', '98', '87', '97'], ['Knuth,', 'Donald', '82', '87', '92', '81'], ['Goldberg,', 'Adele', '94', '96', '90', '91'], ['Kernighan,', 'Brian', '89', '74', '89', '77'], ['Liskov,', 'Barbara', '87', '97', '81', '85']]
And if you want just one number from those lines:
>>> with open('/tmp/file.txt') as f:
...     print([line.split()[2] for line in f])
...
['100', '82', '94', '89', '87']
Opening a file and looping over its lines with a for loop or a list comprehension is an important Python idiom. Prefer it to reading the entire file into memory.
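As a sketch of that idiom, the generator below reads one line at a time and yields the score as an int. The function name exam_scores, its column parameter, and the temporary sample file are assumptions made so the example runs on its own.

```python
import os
import tempfile

def exam_scores(path, column=2):
    """Yield the integer in the given column of each non-blank line."""
    with open(path) as f:
        for line in f:  # the file object yields one line at a time
            fields = line.split()
            if fields:
                yield int(fields[column])

# Write a small sample file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Hopper, Grace 100 98 87 97\nKnuth, Donald 82 87 92 81\n")
    sample_path = tmp.name

first_exam = list(exam_scores(sample_path))
os.remove(sample_path)
print(first_exam)  # [100, 82]
```

Because the function is a generator, only one line is held in memory at a time; list() is used here just to show the result.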
I don't know what your file looks like, but I assume it's something like:
Hopper, Grace 100 98 87 97
Knuth, Donald 82 87 92 81
Goldberg, Adele 94 96 90 91
Kernighan, Brian 89 74 89 77
Liskov, Barbara 87 97 81 85
I also didn't understand what output you want, but I think it's something like this:
[['Hopper,', 'Grace', '100', '98', '87', '97'], ['Knuth,', 'Donald', '82', '87', '92', '81'], ['Goldberg,', 'Adele', '94', '96', '90', '91'], ['Kernighan,', 'Brian', '89', '74', '89', '77'], ['Liskov,', 'Barbara', '87', '97', '81', '85']]
I wrote this one-liner (for Python 3.6):
with open('scores.txt', 'r') as file:
    print([[value for value in line.strip().replace(',','').split()] for line in file])
Same as:
with open('scores.txt', 'r') as file:
    tmp = []
    for line in file:
        # split() already returns a list, so no inner comprehension is needed
        tmp.append(line.strip().replace(',','').split())
    print(tmp)
Output:
[['Hopper', 'Grace', '100', '98', '87', '97'], ['Knuth', 'Donald', '82', '87', '92', '81'], ['Goldberg', 'Adele', '94', '96', '90', '91'], ['Kernighan', 'Brian', '89', '74', '89', '77'], ['Liskov', 'Barbara', '87', '97', '81', '85']]
The same as:
[
    ['Hopper', 'Grace', '100', '98', '87', '97'],
    ['Knuth', 'Donald', '82', '87', '92', '81'],
    ['Goldberg', 'Adele', '94', '96', '90', '91'],
    ['Kernighan', 'Brian', '89', '74', '89', '77'],
    ['Liskov', 'Barbara', '87', '97', '81', '85']
]
I used print() for the output, but you can assign the result to a variable if you want.
PS: I have found an easier solution:
with open('scores.txt', 'r') as file:
    print([line.split() for line in file.read().replace(',','').splitlines()])
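Note that replace(',', '') removes every comma in the text. As a hedged alternative, if only the comma after the last name should go, it can be stripped from that one field after splitting. The sample line and variable names below are illustrative.

```python
line = "Hopper, Grace 100 98 87 97\n"

# Approach from the answer: remove every comma before splitting.
all_removed = line.replace(",", "").split()

# Alternative sketch: split first, then strip the comma from the first field only.
fields = line.split()
fields[0] = fields[0].rstrip(",")

print(all_removed)
print(fields)
```

For this data both approaches give the same list; they would differ only if a comma appeared somewhere other than the end of the first field.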
5 Comments
There's no need to call file.readlines() and then iterate over that with a for loop. Just do for line in file: and let Python split the lines automatically.
Don't read the entire file into memory first. File objects are iterators.
result = []
with open('scores.txt') as f:
    for line in f:
        # E.g., fields == ['Hopper,', 'Grace', '100', '98', '87', '97']
        fields = line.strip().split()
It's not clear what you want as an end result; the first grade of each line, perhaps? After splitting the line, you could get that with
result.append(fields[2])
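Putting the two fragments together, a minimal runnable sketch might look like this. io.StringIO stands in for scores.txt (an assumption, so the example needs no file on disk), and the grades are converted with int(). Since split() with no arguments already discards leading and trailing whitespace, the strip() call is optional.

```python
import io

# Hypothetical stand-in for open('scores.txt'); it iterates line by line like a text file.
f = io.StringIO("Hopper, Grace 100 98 87 97\nKnuth, Donald 82 87 92 81\n")

result = []
for line in f:
    fields = line.split()  # no strip() needed: the trailing newline is ignored
    if len(fields) > 2:    # skip blank or short lines
        result.append(int(fields[2]))

print(result)  # [100, 82]
```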
[...] .read().split() will do (it splits on all whitespace).
[...] file in place. It is just creating a variable line, assigning a value to it, applying the strip function and then throwing that all away and starting over on the next iteration.
[...] .split() you split on any form of whitespace. That means you lose the difference between words and lines.