Reading specific columns from a text file in Python

Question

I have a text file which contains a table of numbers, e.g.:

5 10 6
6 20 1
7 30 4
8 40 3
9 23 1
4 13 6

If, for example, I want only the numbers in the second column, how do I extract that column into a list?

user3804598 · Accepted Answer · 2020-09-14 18:17:12Z

40

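file = "myfile.txt"  # path to the text file (placeholder name)
with open(file, "r") as f:
    lines = f.readlines()
result = []
for x in lines:
    result.append(x.split(' ')[1])  # second column, kept as a string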

You can do the same using a list comprehension:

print([x.split(' ')[1] for x in open(file).readlines()])

Docs on split()

string.split(s[, sep[, maxsplit]])

Return a list of the words of the string s. If the optional second argument sep is absent or None, the words are separated by arbitrary strings of whitespace characters (space, tab, newline, return, formfeed). If the second argument sep is present and not None, it specifies a string to be used as the word separator. The returned list will then have one more item than the number of non-overlapping occurrences of the separator in the string.

So you can omit the space I used and just call x.split(), but be aware that this splits on any run of whitespace, so tabs and newlines are treated as separators too.
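To make the difference concrete, here is a small sketch comparing the two calls. The sample line and variable names are illustrative assumptions, not taken from the question; the line deliberately contains a doubled space and a trailing newline.

```python
# Illustrative line: note the doubled space and the trailing newline.
line = "5  10 6\n"

# An explicit separator splits on every single space, so the doubled space
# yields an empty string and the newline stays attached to the last field.
by_space = line.split(' ')

# With no argument, any run of whitespace acts as one separator and
# leading/trailing whitespace is dropped.
by_whitespace = line.split()

print(by_space)        # ['5', '', '10', '6\n']
print(by_whitespace)   # ['5', '10', '6']

# The fields are strings; convert them if numbers are needed.
second_value = int(by_whitespace[1])
print(second_value)    # 10
```

With the explicit separator, index 1 is the empty string here rather than the second number, which is why x.split() is usually the safer choice for hand-edited files.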

edited Sep 14, 2020 at 18:17

user3804598

answered May 13, 2015 at 13:39

ForceBru

7 Comments

ForceBru Over a year ago

@StefanPochmann, they're dealing with large files there. Here it's used just for clarity. Honestly, I wouldn't trust a site called _stupid_pythonideas :)

Stefan Pochmann Over a year ago

It's not just about large files. And of course your x is very good for clarity.

Adam Smith Over a year ago

Agreed with both commenters. file.readlines should generally be avoided because there's rarely a good reason to build a list from an iterable unless you need it more than once (which you don't in this case). However it's worth mentioning that my answer does effectively the same thing, and isn't drawing criticism. Ultimately @StefanPochmann 's comment is knee-jerk and unhelpful. Most times there will be negligible difference between for line in f and for line in f.readlines().

Adam Smith Over a year ago

In this particular case, of course, we're not even using that list. Simply removing the line lines = f.readlines() and iterating for x in f works exactly the same way (barring some trailing whitespace since you should be doing x.split() not x.split(' ')). It's a negligible difference, but there's no benefit whatsoever.

Stefan Pochmann Over a year ago

@AdamSmith Well, there's the advantage of not doing something generally bad for no good reason whatsoever and advocating it to others who then do the same thing. Also, I think there's definitely a clarity benefit in for line in file (does it get any more natural?) compared to for x in file.readlines(). And I don't see how your answer is comparable. Because of the efficiency issue? That's not why I complained. But even if it were, you're doing it as a necessary part of your approach. Here, on the other hand, it serves absolutely no purpose.

aerobiomat · Accepted Answer · 2018-12-28 09:19:53Z

16

I know this is an old question, but nobody mentioned that when your data looks like an array, numpy's loadtxt comes in handy:

>>> import numpy as np
>>> np.loadtxt("myfile.txt")[:, 1]
array([10., 20., 30., 40., 23., 13.])

answered Dec 28, 2018 at 9:19

aerobiomat

Adam Smith · Accepted Answer · 2015-05-13 14:11:55Z

You have a space-delimited file, so use the module designed for reading delimited-value files, csv.

import csv

with open('path/to/file.txt') as inf:
    reader = csv.reader(inf, delimiter=" ")
    second_col = list(zip(*reader))[1]
    # In Python2, you can omit the `list(...)` cast

The zip(*iterable) pattern is useful for converting rows to columns or vice versa. If you're reading a file row-wise...

>>> testdata = [[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]]

>>> for line in testdata:
...     print(line)

[1, 2, 3]
[4, 5, 6]
[7, 8, 9]

...but need columns, you can pass each row to the zip function

>>> testdata_columns = zip(*testdata)
# this is equivalent to zip([1,2,3], [4,5,6], [7,8,9])

>>> for line in testdata_columns:
...     print(line)

(1, 4, 7)
(2, 5, 8)
(3, 6, 9)
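Applied to the table from the question, the same pattern might look like the sketch below. It uses io.StringIO as a stand-in for the opened file (an assumption made so the example is self-contained) and converts the extracted strings to integers.

```python
import csv
import io

# Stand-in for the file from the question; io.StringIO behaves like an
# open text file, so csv.reader can consume it directly.
sample = io.StringIO("5 10 6\n6 20 1\n7 30 4\n8 40 3\n9 23 1\n4 13 6\n")

reader = csv.reader(sample, delimiter=" ")
columns = list(zip(*reader))               # one tuple per column, values are strings
second_col = [int(v) for v in columns[1]]  # convert the second column to ints

print(second_col)  # [10, 20, 30, 40, 23, 13]
```

Note that zip stops at the shortest row, so a row with missing fields would silently truncate the later columns.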

Kasravnd · Accepted Answer · 2015-05-13 13:43:22Z

6

You can use the zip function with a list comprehension:

with open('ex.txt') as f:
    # list(...) is needed in Python 3, where zip returns an iterator
    print(list(zip(*[line.split() for line in f]))[1])

Result:

('10', '20', '30', '40', '23', '13')

answered May 13, 2015 at 13:43

Kasravnd

ZdaR · Accepted Answer · 2015-05-13 13:49:06Z

4

First we open the file as datafile, then call .read() to get the file contents and split them on whitespace, which returns something like: ['5', '10', '6', '6', '20', '1', '7', '30', '4', '8', '40', '3', '9', '23', '1', '4', '13', '6']. We then slice this list, starting from the element at index 1 and taking every third element until the end of the list.

with open("sample.txt", "r") as datafile:
    print(datafile.read().split()[1::3])

Output:

['10', '20', '30', '40', '23', '13']
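The step of 3 works only because every row has exactly three fields. A minimal sketch of the same idea for an arbitrary column count follows; the helper name column_from_tokens and its parameters are hypothetical, and the text is inlined here instead of being read from sample.txt.

```python
def column_from_tokens(tokens, index, ncols):
    """Return every ncols-th token, starting at position index.

    Hypothetical helper: assumes every row has exactly ncols fields.
    """
    if len(tokens) % ncols != 0:
        raise ValueError("token count is not a multiple of the column count")
    return tokens[index::ncols]


# Inlined copy of the table from the question.
text = "5 10 6\n6 20 1\n7 30 4\n8 40 3\n9 23 1\n4 13 6\n"
tokens = text.split()

second = column_from_tokens(tokens, 1, 3)
print(second)  # ['10', '20', '30', '40', '23', '13']
```

The length check catches some ragged input, but a file where missing fields happen to cancel out would still be misread, so a row-by-row approach is more robust for untrusted data.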

edited May 13, 2015 at 13:49

answered May 13, 2015 at 13:43

ZdaR

StephanSchrodinger · Accepted Answer · 2019-02-14 20:59:33Z

0

This may help:

import csv
with open('csv_file', 'r') as f:
    # Read all rows; skipinitialspace=True ignores spaces right after each delimiter
    lines = list(csv.reader(f, delimiter=' ', skipinitialspace=True))
    # Print the second column of the last row
    print(lines[-1][1])
    # Print the second column of every row except the last 10
    for i in range(len(lines) - 10):
        print(lines[i][1])

answered Feb 14, 2019 at 20:59

StephanSchrodinger
