3

I am trying to write a program that prints the rows of a text file where one value exceeds another value in the same row. For example, this is a small part of the text file:

01,test1,202,290,A,290

02,test2,303,730,A,0

03,test3,404,180,N,180

The program should select all lines that have 'A' in them and, of those, keep only the lines where the 4th column (290 in the first line) is greater than the 6th column (also 290 in the first line), then print them. So, for the text file above, the Python program should print only this line:

02,test2,303,730,A,0

The best I can do is simply print all lines that have 'A' in them by using:

F = open("TEST.txt").read()
for line in F.split():
    if 'A' in line:
        Column = line.split(',')

However, this only selects the lines with 'A' in them. When I try to filter on whether the 4th column is greater than the 6th column, I get various errors. Can somebody please help me with this problem?

4 Answers

1

The csv module will parse the file into rows for you. You should never compare numbers as strings, because strings are compared lexicographically, which gives incorrect results. Using in would also match A in "Apple" or anywhere else it appears in the line, not just as an exact field value. If you want an exact match in a particular column, test that column directly:

In [8]: cat test.txt
01,test1,202,290,A,290
02,test2,303,730,A,0
03,test3,404,180,N,180

In [9]: from csv import reader

In [10]: for row in reader(open("test.txt")):
   ....:     if row[4] == "A" and float(row[3]) > float(row[5]):
   ....:         print(row)
   ....:
['02', 'test2', '303', '730', 'A', '0']

Why comparing numbers as strings is a bad idea:

In [11]: "2" > "1234"
Out[11]: True

In [12]: float("2") > float("1234")
Out[12]: False
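
Outside an interactive session, the same filter might look like the sketch below. The function name select_rows and the inline sample list are illustrative assumptions, not part of the code above. Skipping rows with fewer than six fields is also an assumption about how blank lines should be treated.

```python
import csv

def select_rows(lines):
    """Yield rows whose 5th field is exactly 'A' and whose 4th field exceeds the 6th."""
    for row in csv.reader(lines):
        if len(row) < 6:
            continue  # assumption: ignore blank or short lines
        # Exact match on the column, and a numeric (not string) comparison
        if row[4] == "A" and float(row[3]) > float(row[5]):
            yield row

# Sample data copied from the question; a real script could pass an open file instead
sample = [
    "01,test1,202,290,A,290",
    "02,test2,303,730,A,0",
    "03,test3,404,180,N,180",
]

selected = list(select_rows(sample))
for row in selected:
    print(",".join(row))
```

csv.reader accepts any iterable of strings, so select_rows(open("test.txt")) should work the same way on the file itself.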

0

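You can try the code below:

for line in open(filename):
    if 'A' in line:
        # Strip the trailing newline before splitting into columns
        Column = line.strip().split(',')
        # Convert to numbers; comparing the raw strings would be lexicographic
        if float(Column[3]) > float(Column[5]):
            print(Column)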

0

Try the following code:

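from __future__ import print_function

def condition(cols):
    # Compare the 4th and 6th columns as numbers, not as strings
    return cols[4] == 'A' and float(cols[3]) > float(cols[5])

with open('data.txt', 'r') as f:
    data = f.readlines()

for line in data:
    # Strip the trailing newline so the last column parses cleanly
    if condition(line.strip().split(',')):
        print(line.strip())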

You can set any logical filtering conditions in the "condition" function

3 Comments

This is a good solution but how would you add the difference between Column[5] and Column[3] for each line?
@J.p I don't get your question/comment. Do you want to print the difference between Column[5] and Column[3]?
Yes, for each line separately and then also as a total value (e.g. differences of 290, 380, 380 for the separate lines, and also the total difference, which would be 290+380+380=1050 across those lines).
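
One way to get the per-line difference and the running total asked about here is sketched below. It is not from the answer above: the function name differences is hypothetical, and it assumes the difference wanted is column 4 minus column 6, that both columns hold integers, and that only the 'A' rows where column 4 is larger should count.

```python
import csv

def differences(lines):
    """Return ([(row, diff), ...], total) for rows marked 'A' whose 4th field exceeds the 6th."""
    results = []
    total = 0
    for row in csv.reader(lines):
        if len(row) < 6 or row[4] != "A":
            continue  # assumption: skip blank lines and rows not marked 'A'
        diff = int(row[3]) - int(row[5])  # assumption: integer columns
        if diff > 0:
            results.append((row, diff))
            total += diff
    return results, total

# Sample data copied from the question
sample = [
    "01,test1,202,290,A,290",
    "02,test2,303,730,A,0",
    "03,test3,404,180,N,180",
]

results, total = differences(sample)
for row, diff in results:
    print(",".join(row), "difference:", diff)
print("total difference:", total)
```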
0

I guess you should definitely take a look at pandas.

It will make everything much easier:

from __future__ import print_function
import pandas as pd

df = pd.read_csv('data.txt', names=['col1','col2','col3','col4','col5','col6'])
print('Given data-set')
print(df)

df['diff'] = df['col4'] - df['col6']
flt = df[(df.col5 == 'A') & (df.col4 > df.col6)]
print('Filtered data-set')
print(flt)

#print(df.sum(axis=0, numeric_only=True))
print('sum(col6) = %d' % (df.sum(axis=0, numeric_only=True)['col6']))

Output:

Given data-set
   col1   col2  col3  col4 col5  col6
0     1  test1   202   290    A   290
1     2  test2   303   730    A     0
2     3  test3   404   180    N   180
Filtered data-set
   col1   col2  col3  col4 col5  col6  diff
1     2  test2   303   730    A     0   730
sum(col6) = 470
