How to print out specific rows/lines in a text file based on a condition (greater than or less than)

Question

I am trying to write a program that prints the specific rows/lines of a text file where one value exceeds another value in the same line. For example, this is a small part of the text file:

01,test1,202,290,A,290

02,test2,303,730,A,0

03,test3,404,180,N,180

The program should select all lines that have 'A' in them, but only those where the 4th column (290 in the first line) is greater than the 6th column (290 in the first line), and then print them. So, for the text file above, the Python program should print only this line:

02,test2,303,730,A,0

The best I can do is simply print all lines that have 'A' in them by using:

F = open("TEST.txt").read()
for line in F.split():
    if 'A' in line:
        Column=line.split(',')

However, this only selects the lines with 'A' in them. When I attempt to filter on whether the 4th column is greater than the 6th column, I get various errors. Can somebody please help me with this problem?

Padraic Cunningham · Accepted Answer · 2016-02-14 20:59:38Z

1

The csv library will parse the file into rows for you. You should never compare numbers as strings, because they are compared lexicographically, which gives incorrect results. Also, using in would match A in "Apple" or anywhere else it appears in the line, not just as an exact match. If you want an exact match in a particular column, check exactly that:

In [8]: cat test.txt
01,test1,202,290,A,290
02,test2,303,730,A,0
03,test3,404,180,N,180

In [9]: from csv import reader

In [10]: for row in reader(open("test.txt")):
   ....:     if row[4] == "A" and float(row[3]) > float(row[5]):
   ....:         print(row)
   ....:
['02', 'test2', '303', '730', 'A', '0']

Why comparing numbers as strings is a bad idea:

In [11]: "2" > "1234"
Out[11]: True

In [12]: float("2") > float("1234")
Out[12]: False
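The same idea can be written as a standalone script. This is only a sketch under a few assumptions: `filter_rows` is a hypothetical helper name, and `io.StringIO` stands in for the opened file so the example runs without `test.txt` on disk.

```python
import csv
import io

# Sample data copied from the question; io.StringIO stands in for open("test.txt").
SAMPLE = """01,test1,202,290,A,290
02,test2,303,730,A,0
03,test3,404,180,N,180
"""


def filter_rows(lines, flag="A"):
    """Return rows whose 5th column equals `flag` exactly and whose
    4th column is numerically greater than the 6th column."""
    matches = []
    for row in csv.reader(lines):
        if len(row) < 6:
            continue  # skip blank or short lines
        if row[4] == flag and float(row[3]) > float(row[5]):
            matches.append(row)
    return matches


rows = filter_rows(io.StringIO(SAMPLE))
for row in rows:
    print(",".join(row))
```

With a real file, passing `open("test.txt", newline="")` instead of the `StringIO` object should behave the same way.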

edited Feb 14, 2016 at 20:59

answered Feb 14, 2016 at 20:52

Padraic Cunningham


AlokThakur · Accepted Answer · 2016-02-13 15:50:33Z

0

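You can try the code below:

for line in open(filename):
    if 'A' in line:
        Column = line.strip().split(',')
        # Convert to numbers first; comparing the raw strings is lexicographic
        if float(Column[3]) > float(Column[5]):
            print(Column)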

answered Feb 13, 2016 at 15:50

AlokThakur


MaxU - stand with Ukraine · Accepted Answer · 2016-02-13 15:51:12Z

0

Try the following code:

from __future__ import print_function

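def condition(cols):
    # Compare numerically; comparing the raw strings would be lexicographic
    return cols[4] == 'A' and float(cols[3]) > float(cols[5])

with open('data.txt', 'r') as f:
    data = f.readlines()

# A plain loop is clearer than a list comprehension used only for its side effects
for line in data:
    if condition(line.strip().split(',')):
        print(line.strip())

You can set any logical filtering condition in the "condition" function.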
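Since the question mentions both greater-than and less-than filters, one way to make the condition swappable is to build it from a comparison function. The following is a minimal sketch, not part of the original answer: `make_condition` is a hypothetical helper, and the sample lines are taken from the question.

```python
import operator


def make_condition(flag, compare):
    """Build a predicate that checks column 5 for `flag` and applies
    `compare` to columns 4 and 6 as numbers (hypothetical helper)."""
    def condition(cols):
        return cols[4] == flag and compare(float(cols[3]), float(cols[5]))
    return condition


# Sample lines from the question, already stripped of newlines.
lines = [
    "01,test1,202,290,A,290",
    "02,test2,303,730,A,0",
    "03,test3,404,180,N,180",
]

greater = make_condition('A', operator.gt)   # 4th column > 6th column
at_least = make_condition('A', operator.ge)  # 4th column >= 6th column

selected_gt = [line for line in lines if greater(line.split(','))]
selected_ge = [line for line in lines if at_least(line.split(','))]

print(selected_gt)
print(selected_ge)
```

Swapping in `operator.lt` or `operator.le` would give the less-than variants without touching the loop.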

answered Feb 13, 2016 at 15:51

MaxU - stand with Ukraine


3 Comments

J.p Over a year ago

This is a good solution but how would you add the difference between Column[5] and Column[3] for each line?

MaxU - stand with Ukraine Over a year ago

@J.p I don't get your question/comment. Do you want to print the difference between Column[5] and Column[3]?

J.p Over a year ago

Yes, for each line separately and also as a total value (e.g. differences of 290, 380, 380 for the separate lines, and a total difference of 290+380+380=1050).

MaxU - stand with Ukraine · Accepted Answer · 2016-02-14 20:42:36Z

I guess you should definitely take a look at pandas.

It will make everything much easier:

from __future__ import print_function
import pandas as pd

df = pd.read_csv('data.txt', names=['col1','col2','col3','col4','col5','col6'])
print('Given data-set')
print(df)

df['diff'] = df['col4'] - df['col6']
flt = df[(df.col5 == 'A') & (df.col4 > df.col6)]
print('Filtered data-set')
print(flt)

#print(df.sum(axis=0, numeric_only=True))
print('sum(col6) = %d' % (df.sum(axis=0, numeric_only=True)['col6']))

Output:

Given data-set
   col1   col2  col3  col4 col5  col6
0     1  test1   202   290    A   290
1     2  test2   303   730    A     0
2     3  test3   404   180    N   180
Filtered data-set
   col1   col2  col3  col4 col5  col6  diff
1     2  test2   303   730    A     0   730
sum(col6) = 470
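For the follow-up request in the comments (a per-line difference plus a total), a standard-library version might look like the sketch below. It assumes that only rows passing the filter should count and that the difference is 4th column minus 6th column; `diffs_and_total` is a hypothetical name, and `io.StringIO` stands in for the data file.

```python
import csv
import io

# Sample data copied from the question; io.StringIO stands in for the file.
SAMPLE = """01,test1,202,290,A,290
02,test2,303,730,A,0
03,test3,404,180,N,180
"""


def diffs_and_total(lines, flag="A"):
    """Return ([(row_id, difference), ...], total) for rows where column 5
    equals `flag` and column 4 exceeds column 6 (assumed interpretation)."""
    diffs = []
    for row in csv.reader(lines):
        if len(row) < 6 or row[4] != flag:
            continue  # skip short lines and rows with a different flag
        diff = float(row[3]) - float(row[5])
        if diff > 0:
            diffs.append((row[0], diff))
    total = sum(diff for _, diff in diffs)
    return diffs, total


diffs, total = diffs_and_total(io.StringIO(SAMPLE))
for row_id, diff in diffs:
    print(row_id, diff)
print("total:", total)
```

In the pandas version above, the equivalent total would come from summing the `diff` column of the filtered frame rather than `col6`.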
