Reading a file - python?

Question

I want to use the split string method to extract information from each line into a list.

What is the format of your file? You probably don’t want to throw away line information, which is what .read().split() will do (it splits on all whitespace).
– Ry- ♦, Commented May 31, 2017 at 22:16
@Ryan it's just a table with the last, first name, then exam 1, then exam 2 (up to exam 4)
– Nora, Commented May 31, 2017 at 22:20
I don't think your for-loop does what you are expecting. It will not modify file in place. It just creates a variable line, assigns a value to it, applies the strip function, then throws that all away and starts over on the next iteration.
– Metropolis, Commented May 31, 2017 at 22:24
With .split() you split on any form of whitespace. That means you lose the difference between words and lines.
– dawg, Commented May 31, 2017 at 22:33

Djaouad · Accepted Answer · 2017-05-31 22:31:44Z

1

Use splitlines instead, so each line stays separate:

with open('scores.txt', 'r') as f:  # the with block closes the file for you
    lines = f.read().splitlines()  # one string per line, newlines removed
exam_one = []
for line in lines:
    fields = line.split()  # split, not strip
    exam_one.append(int(fields[2]))  # or use float() if scores can be fractional
print(exam_one)  # => [100, 82, 94, 89, 87]

edited May 31, 2017 at 22:31

answered May 31, 2017 at 22:22

Djaouad


2 Comments

Nora Over a year ago

Thanks but how does this answer my question

Djaouad Over a year ago

By the way, exam_one = line[2], not [2::8]

dawg · Accepted Answer · 2017-05-31 23:09:26Z

Suppose you have the following string that has words (separated by horizontal whitespace) and lines (separated by \n or vertical whitespace):

>>> print(data)
Hopper, Grace 100 98 87 97
Knuth, Donald 82 87 92 81
Goldberg, Adele 94 96 90 91
Kernighan, Brian 89 74 89 77
Liskov, Barbara 87 97 81 85

If you just use .split() you lose all difference between lines and words:

>>> data.split()
['Hopper,', 'Grace', '100', '98', '87', '97', 'Knuth,', 'Donald', '82', '87', '92', '81', 'Goldberg,', 'Adele', '94', '96', '90', '91', 'Kernighan,', 'Brian', '89', '74', '89', '77', 'Liskov,', 'Barbara', '87', '97', '81', '85']

To maintain the difference, you need to combine .splitlines() with .split():

>>> [line.split() for line in data.splitlines()]
[['Hopper,', 'Grace', '100', '98', '87', '97'], ['Knuth,', 'Donald', '82', '87', '92', '81'], ['Goldberg,', 'Adele', '94', '96', '90', '91'], ['Kernighan,', 'Brian', '89', '74', '89', '77'], ['Liskov,', 'Barbara', '87', '97', '81', '85']]

The same concept applies to data read from files. Instead of using .splitlines() you can iterate over the individual lines of the file with a for loop:

>>> with open('/tmp/file.txt') as f:
...    for line in f:
...       print(line.split())
... 
['Hopper,', 'Grace', '100', '98', '87', '97']
['Knuth,', 'Donald', '82', '87', '92', '81']
['Goldberg,', 'Adele', '94', '96', '90', '91']
['Kernighan,', 'Brian', '89', '74', '89', '77']
['Liskov,', 'Barbara', '87', '97', '81', '85']

Or, if you want nested lists:

>>> with open('/tmp/file.txt') as f:
...    print([line.split() for line in f])
... 
[['Hopper,', 'Grace', '100', '98', '87', '97'], ['Knuth,', 'Donald', '82', '87', '92', '81'], ['Goldberg,', 'Adele', '94', '96', '90', '91'], ['Kernighan,', 'Brian', '89', '74', '89', '77'], ['Liskov,', 'Barbara', '87', '97', '81', '85']]

And if you want just one number from those lines:

>>> with open('/tmp/file.txt') as f:
...    print([line.split()[2] for line in f])
... 
['100', '82', '94', '89', '87']
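
If you want every column rather than one, here is a minimal sketch that uses zip(*rows) to transpose the rows into columns. In it, io.StringIO stands in for the open file, the sample data is the table above, and the names rows, columns, and exam_one are only illustrative. It assumes every line has the same number of fields.

```python
import io

# Stand-in for open('/tmp/file.txt'); iterating it yields lines like a real file.
sample = io.StringIO(
    "Hopper, Grace 100 98 87 97\n"
    "Knuth, Donald 82 87 92 81\n"
    "Goldberg, Adele 94 96 90 91\n"
    "Kernighan, Brian 89 74 89 77\n"
    "Liskov, Barbara 87 97 81 85\n"
)

with sample as f:
    rows = [line.split() for line in f]

# zip(*rows) groups the i-th field of every row, i.e. it transposes the table.
columns = [list(col) for col in zip(*rows)]

# Columns 0 and 1 hold the names; columns 2..5 hold the four exams (as strings).
exam_one = [int(score) for score in columns[2]]
print(exam_one)  # [100, 82, 94, 89, 87]
```

Note that zip stops at the shortest row, so a line with a missing score would silently truncate every column.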

The form of opening a file and looping over the lines with a for loop or list comprehension is considered an important Python idiom. Use those rather than reading the entire file into memory.
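
As an illustration of why the line-by-line idiom helps, here is a sketch that averages the first exam while holding only one line in memory at a time. io.StringIO stands in for the file, and the function name average_exam_one is hypothetical. It assumes the score is always the third whitespace-separated field.

```python
import io


def average_exam_one(f):
    """Average the third field of each line, reading one line at a time."""
    total = 0
    count = 0
    for line in f:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        total += int(fields[2])
        count += 1
    return total / count if count else 0.0


# Assumed sample data; a real call would pass the object returned by open().
sample = io.StringIO(
    "Hopper, Grace 100 98 87 97\n"
    "Knuth, Donald 82 87 92 81\n"
)
print(average_exam_one(sample))  # 91.0
```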

Ender Look · Accepted Answer · 2017-06-01 21:15:19Z

1

I don't know what your file looks like, but I think it's something like:

Hopper, Grace 100 98 87 97
Knuth, Donald 82 87 92 81
Goldberg, Adele 94 96 90 91
Kernighan, Brian 89 74 89 77
Liskov, Barbara 87 97 81 85

I also didn't understand what output you want, but I think it's something like this:

[['Hopper,', 'Grace', '100', '98', '87', '97'], ['Knuth,', 'Donald', '82', '87', '92', '81'], ['Goldberg,', 'Adele', '94', '96', '90', '91'], ['Kernighan,', 'Brian', '89', '74', '89', '77'], ['Liskov,', 'Barbara', '87', '97', '81', '85']]

I have written this one-line solution (for Python 3.6):

with open('scores.txt', 'r') as file:
    print([[value for value in line.strip().replace(',','').split()] for line in file])

Same as:

with open('scores.txt', 'r') as file:
    tmp = []
    for line in file:
        tmp.append(line.strip().replace(',','').split())
        # Note: keep tmp.append(...); assigning tmp = ... here would keep only the last line.
print(tmp)

Output:

[['Hopper', 'Grace', '100', '98', '87', '97'], ['Knuth', 'Donald', '82', '87', '92', '81'], ['Goldberg', 'Adele', '94', '96', '90', '91'], ['Kernighan', 'Brian', '89', '74', '89', '77'], ['Liskov', 'Barbara', '87', '97', '81', '85']]

The same as:

[
    ['Hopper', 'Grace', '100', '98', '87', '97'],
    ['Knuth', 'Donald', '82', '87', '92', '81'],
    ['Goldberg', 'Adele', '94', '96', '90', '91'],
    ['Kernighan', 'Brian', '89', '74', '89', '77'],
    ['Liskov', 'Barbara', '87', '97', '81', '85']
]

I used print() for the output, but you can assign the result to a variable if you want.

P.S.: I have found an easier solution:

with open('scores.txt', 'r') as file:
    print([line.split() for line in file.read().replace(',','').splitlines()])
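
Stripping the commas still leaves the last and first name as separate fields. Here is a sketch of an alternative, assuming every line ends in exactly four scores: str.rsplit with maxsplit=4 splits from the right, so the name stays in one piece. The function name parse_line is hypothetical.

```python
def parse_line(line):
    """Split 'Last, First s1 s2 s3 s4' into (name, [s1, s2, s3, s4])."""
    # Split from the right at most 4 times; whatever is left over is the name.
    name, *scores = line.rsplit(maxsplit=4)
    return name, [int(s) for s in scores]


print(parse_line("Hopper, Grace 100 98 87 97\n"))
# ('Hopper, Grace', [100, 98, 87, 97])
```

This keeps working for names with more than two words, as long as the four scores come last.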

edited Jun 1, 2017 at 21:15

answered May 31, 2017 at 22:39

Ender Look


5 Comments

chepner Over a year ago

Please don't imply that writing all that in one line is in any way a good idea.

Ender Look Over a year ago

@chepner I know, which is why I also wrote the multi-line version. It's up to the programmer's preference.

Ender Look Over a year ago

@dawg OK, I'll do my best, but I don't know much about coding.

dawg Over a year ago

Not downvote-worthy, but there is no reason to first read in the entire file with file.readlines() and then iterate over that with a for loop. Just do for line in file: and let Python split the lines automatically.

Ender Look Over a year ago

@dawg OK, thanks, I didn't know about that. I will fix my code.

chepner · Accepted Answer · 2017-05-31 22:27:53Z

0

Don't read the entire file into memory first. File objects are iterators.

result = []
with open('scores.txt') as f:
    for line in f:
        # E.g., fields == ['Hopper,', 'Grace', '100', '98', '87', '97']
        fields = line.strip().split()

It's not clear what you want as an end result; the first grade of each line, perhaps? After splitting the line, you could get that with

result.append(fields[2])
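
Putting the two fragments together, here is a sketch with the append inside the loop body, so one value is collected per line. Placing it after the loop would record only the last line. io.StringIO stands in for open('scores.txt'), and the sample rows are assumed.

```python
import io

# Stand-in for open('scores.txt'); the rows are assumed sample data.
f = io.StringIO(
    "Hopper, Grace 100 98 87 97\n"
    "Knuth, Donald 82 87 92 81\n"
)

result = []
with f:
    for line in f:
        fields = line.strip().split()
        result.append(fields[2])  # inside the loop: runs once per line

print(result)  # ['100', '82']
```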

answered May 31, 2017 at 22:27

chepner


4 Comments

Nora Over a year ago

I wrote a sample result at the end of my comment. I want to create a list containing the contents of each column.

chepner Over a year ago

That list contains the contents of the first column, as I assumed in the question. It does not contain the contents of each (or every) column.

Nora Over a year ago

Also when I use your code it only prints the last line.

chepner Over a year ago

My code doesn't print anything, so I don't know what you are doing.
