0

I want to use the split string method to extract information from each line into a list.

  • What is the format of your file? You probably don’t want to throw away line information, which is what .read().split() will do (it splits on all whitespace). Commented May 31, 2017 at 22:16
  • .readlines() is probably better Commented May 31, 2017 at 22:17
  • @Ryan it's just a table with the last name, first name, then exam 1, then exam 2 (and so on up to exam 4) Commented May 31, 2017 at 22:20
  • I don't think your for-loop does what you are expecting. It will not modify file in place. It is just creating a variable line, assigning a value to it, applying the strip function and then throwing that all away and starting over on the next iteration. Commented May 31, 2017 at 22:24
  • With .split() you split on any form of whitespace. That means you lose the distinction between words and lines. Commented May 31, 2017 at 22:33
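To illustrate the comment about the for-loop: the asker's original loop is not shown here, so the sketch below assumes it looked roughly like a loop that calls strip() on each line. The two sample strings are stand-ins for the file contents. Reassigning the loop variable only rebinds a name, so the list being iterated over stays as it was.

```python
# Hypothetical stand-in for the lines of the score file.
lines = ["Hopper, Grace 100 98 87 97\n", "Knuth, Donald 82 87 92 81\n"]

for line in lines:
    line = line.strip()  # rebinds the name `line`; `lines` is untouched

print(lines[0].endswith("\n"))  # True: the original strings still end in "\n"

# To keep the stripped values, build a new list instead.
stripped = [line.strip() for line in lines]
print(stripped[0])  # Hopper, Grace 100 98 87 97
```

Building a new list (or appending inside the loop) is what actually keeps the processed values.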

4 Answers

1

Use splitlines(), which keeps the lines separate:

with open('scores.txt') as f:  # the file is closed when the block ends
    lines = f.read().splitlines()
exam_one = []
for line in lines:
    fields = line.split()  # split(), not strip()
    exam_one.append(int(fields[2]))  # or use float() if scores can be fractional
print(exam_one)  # => [100, 82, 94, 89, 87]

2 Comments

Thanks, but how does this answer my question?
By the way, exam_one = line[2], not [2::8]
1

Suppose you have the following string that has words (separated by horizontal whitespace) and lines (separated by \n or vertical whitespace):

>>> print(data)
Hopper, Grace 100 98 87 97
Knuth, Donald 82 87 92 81
Goldberg, Adele 94 96 90 91
Kernighan, Brian 89 74 89 77
Liskov, Barbara 87 97 81 85

If you just use .split() you lose the distinction between lines and words:

>>> data.split()
['Hopper,', 'Grace', '100', '98', '87', '97', 'Knuth,', 'Donald', '82', '87', '92', '81', 'Goldberg,', 'Adele', '94', '96', '90', '91', 'Kernighan,', 'Brian', '89', '74', '89', '77', 'Liskov,', 'Barbara', '87', '97', '81', '85']

To maintain the difference, you need to combine .splitlines() with .split():

>>> [line.split() for line in data.splitlines()]
[['Hopper,', 'Grace', '100', '98', '87', '97'], ['Knuth,', 'Donald', '82', '87', '92', '81'], ['Goldberg,', 'Adele', '94', '96', '90', '91'], ['Kernighan,', 'Brian', '89', '74', '89', '77'], ['Liskov,', 'Barbara', '87', '97', '81', '85']]

The same concept applies to data read from files. Instead of using .splitlines() you can iterate over the individual lines of the file with a for loop:

>>> with open('/tmp/file.txt') as f:
...    for line in f:
...       print(line.split())
... 
['Hopper,', 'Grace', '100', '98', '87', '97']
['Knuth,', 'Donald', '82', '87', '92', '81']
['Goldberg,', 'Adele', '94', '96', '90', '91']
['Kernighan,', 'Brian', '89', '74', '89', '77']
['Liskov,', 'Barbara', '87', '97', '81', '85']

Or, if you want nested lists:

>>> with open('/tmp/file.txt') as f:
...    print([line.split() for line in f])
... 
[['Hopper,', 'Grace', '100', '98', '87', '97'], ['Knuth,', 'Donald', '82', '87', '92', '81'], ['Goldberg,', 'Adele', '94', '96', '90', '91'], ['Kernighan,', 'Brian', '89', '74', '89', '77'], ['Liskov,', 'Barbara', '87', '97', '81', '85']]

And if you want just one number from those lines:

>>> with open('/tmp/file.txt') as f:
...    print([line.split()[2] for line in f])
... 
['100', '82', '94', '89', '87']

The form of opening a file and looping over the lines with a for loop or list comprehension is considered an important Python idiom. Use those rather than reading the entire file into memory.
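Building on the nested lists above, if what you want is one list per column rather than one list per line, a possible sketch is to transpose the rows with zip(*rows). The helper name read_columns and the SAMPLE_LINES list are made up for illustration; the function accepts any iterable of lines, so an open file object works as well.

```python
# Sample rows taken from the example above; in practice `lines` would be
# an open file object, e.g. inside `with open('/tmp/file.txt') as f:`.
SAMPLE_LINES = [
    "Hopper, Grace 100 98 87 97\n",
    "Knuth, Donald 82 87 92 81\n",
    "Goldberg, Adele 94 96 90 91\n",
    "Kernighan, Brian 89 74 89 77\n",
    "Liskov, Barbara 87 97 81 85\n",
]


def read_columns(lines):
    """Return one list per column (hypothetical helper, not from the answer)."""
    rows = [line.split() for line in lines if line.strip()]  # skip blank lines
    # zip(*rows) groups the i-th field of every row, i.e. transposes the table.
    return [list(column) for column in zip(*rows)]


columns = read_columns(SAMPLE_LINES)
exam_one = [int(score) for score in columns[2]]
print(exam_one)  # [100, 82, 94, 89, 87]
```

Note that zip stops at the shortest row, so this assumes every line has the same number of fields.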


1

I don't know what your file looks like, but I assume it's something like:

Hopper, Grace 100 98 87 97
Knuth, Donald 82 87 92 81
Goldberg, Adele 94 96 90 91
Kernighan, Brian 89 74 89 77
Liskov, Barbara 87 97 81 85

I also wasn't sure what output you want, but I think it's something like this:

[['Hopper,', 'Grace', '100', '98', '87', '97'], ['Knuth,', 'Donald', '82', '87', '92', '81'], ['Goldberg,', 'Adele', '94', '96', '90', '91'], ['Kernighan,', 'Brian', '89', '74', '89', '77'], ['Liskov,', 'Barbara', '87', '97', '81', '85']]

I wrote this one-liner (for Python 3.6):

with open('scores.txt', 'r') as file:
    print([line.strip().replace(',', '').split() for line in file])

Same as:

with open('scores.txt', 'r') as file:
    tmp = []
    for line in file:
        tmp.append(line.strip().replace(',','').split())
        # The whole loop can also be written as one comprehension:
        # tmp = [line.strip().replace(',', '').split() for line in file]
print(tmp)

Output:

[['Hopper', 'Grace', '100', '98', '87', '97'], ['Knuth', 'Donald', '82', '87', '92', '81'], ['Goldberg', 'Adele', '94', '96', '90', '91'], ['Kernighan', 'Brian', '89', '74', '89', '77'], ['Liskov', 'Barbara', '87', '97', '81', '85']]

Note that the commas after the last names are gone, because replace(',', '') removes them. Formatted for readability, that is:

[
    ['Hopper', 'Grace', '100', '98', '87', '97'],
    ['Knuth', 'Donald', '82', '87', '92', '81'],
    ['Goldberg', 'Adele', '94', '96', '90', '91'],
    ['Kernighan', 'Brian', '89', '74', '89', '77'],
    ['Liskov', 'Barbara', '87', '97', '81', '85']
]

I used print() for the output, but you can assign the result to a variable if you want.

PS: I have found an easier solution:

with open('scores.txt', 'r') as file:
    print([line.split() for line in file.read().replace(',','').splitlines()])
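If you would rather keep each name in one piece instead of stripping the comma, one option is str.rsplit with a maxsplit, which peels the scores off the right-hand side of the line. This is only a sketch: parse_record and its n_exams parameter are hypothetical names, and it assumes every line ends with exactly four integer scores, as in the sample data.

```python
def parse_record(line, n_exams=4):
    """Split one line into (name, scores); hypothetical helper."""
    # rsplit with maxsplit=n_exams splits off the last n_exams fields,
    # leaving the whole name, comma and space included, as the first item.
    name, *scores = line.strip().rsplit(None, n_exams)
    return name, [int(score) for score in scores]


sample = ["Hopper, Grace 100 98 87 97\n", "Knuth, Donald 82 87 92 81\n"]
records = [parse_record(line) for line in sample]
print(records[0])  # ('Hopper, Grace', [100, 98, 87, 97])
```

The same function can be applied to each line of an open file, for example [parse_record(line) for line in file].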

5 Comments

Please don't imply that writing all that in one line is in any way a good idea.
@chepner I know, which is why I also included the multi-line version. It comes down to the programmer's preference.
@dawg OK, I'll do my best, but I don't know much about coding.
Not dv worthy, but there is no reason to first read in the entire file with file.readlines() and then iterate over that with a for loop. Just do for line in file: and let Python split the lines automatically.
@dawg OK, thanks. I didn't know about that; I will fix my code.
0

Don't read the entire file into memory first. File objects are iterators.

result = []
with open('scores.txt') as f:
    for line in f:
        # E.g., fields == ['Hopper,', 'Grace', '100', '98', '87', '97']
        fields = line.strip().split()

It's not clear what you want as an end result; the first grade of each line, perhaps? After splitting the line, you could get that with

result.append(fields[2])
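Put together with the loop above, that might look like the following sketch. The function name first_grades and the sample list are made up for illustration, and the function takes any iterable of lines, so an open file can be passed in directly. The append has to sit inside the loop body; placed after the loop, it would only see the fields of the last line.

```python
def first_grades(lines):
    """Collect the third field of every line (hypothetical helper)."""
    result = []
    for line in lines:
        fields = line.strip().split()
        if len(fields) < 3:  # skip blank or malformed lines
            continue
        result.append(fields[2])  # inside the loop: runs once per line
    return result


sample = [
    "Hopper, Grace 100 98 87 97\n",
    "Knuth, Donald 82 87 92 81\n",
    "\n",
    "Goldberg, Adele 94 96 90 91\n",
]
print(first_grades(sample))  # ['100', '82', '94']
```

With a real file this would be called as first_grades(f) inside a with open('scores.txt') as f: block.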

4 Comments

I wrote a sample result at the end of my comment. I want to create a list containing the contents of each column.
That list contains the contents of the first column, as I assumed in the question. It does not contain the contents of each (or every) column.
Also when I use your code it only prints the last line.
My code doesn't print anything, so I don't know what you are doing.
