I am reading a CSV file with the csv module and want to convert the strings in specific columns to floats, but I can't figure out how.

import csv

with open('diamonds_testing.csv') as csv_file:
    csv_reader = csv.reader(csv_file)
    diamond = list(csv_reader)

print(diamond[1])
print(diamond[2])

output:

['0.23', 'Ideal', 'E', 'SI2', '55', '3.95', '3.98', '2.43']
         
['0.31', 'Very Good', 'J', 'SI1', '62', '4.39', '4.43', '2.62']

I want the values in columns 0, 4, 5, 6 and 7 to be floats. Thank you.

1
  • Is this related to pandas at all? I think pandas will infer the value types properly. Commented Apr 30, 2021 at 9:50

3 Answers

1
import csv

with open('diamonds_testing.csv') as csv_file:
    csv_reader = csv.reader(csv_file)
    diamond = list(csv_reader)

lines = []
for line in diamond:
    # Convert the numeric columns of this row in place.
    for i in [0, 4, 5, 6, 7]:
        line[i] = float(line[i])
    lines.append(line)
print(lines)

Use float() to cast each string to a number. To go through the data row by row, use:

for line in diamond:

2 Comments

ValueError: could not convert string to float: 'carat'
Are you sure that every CSV file is laid out so that columns 0, 4, 5, 6 and 7 always hold numbers stored as strings? Try debugging to catch the noisy row that throws that error.
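One way to follow that debugging suggestion is to try the conversion cell by cell and record what fails. This is only a sketch: `find_bad_rows` is a hypothetical helper, the data is an in-memory stand-in for diamonds_testing.csv, and every header name except 'carat' (which appears in the error message) is assumed.

```python
import csv
import io

# Stand-in for diamonds_testing.csv. Only 'carat' is confirmed by the error
# message; the other header names are assumptions made for this sketch.
SAMPLE = """carat,cut,color,clarity,table,x,y,z
0.23,Ideal,E,SI2,55,3.95,3.98,2.43
0.31,Very Good,J,SI1,62,4.39,4.43,2.62
"""

FLOAT_COLUMNS = [0, 4, 5, 6, 7]


def find_bad_rows(rows, columns=FLOAT_COLUMNS):
    """Return (row_index, column_index, value) for each cell float() rejects."""
    bad = []
    for row_index, row in enumerate(rows):
        for col in columns:
            try:
                float(row[col])
            except (ValueError, IndexError):
                # A short row has no value at this position, so report None.
                value = row[col] if col < len(row) else None
                bad.append((row_index, col, value))
    return bad


rows = list(csv.reader(io.StringIO(SAMPLE)))
bad_cells = find_bad_rows(rows)
print(bad_cells)
```

With this sample, every reported cell sits in row 0, which suggests the failure comes from a header row rather than from bad data further down.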
0

The csv module is a lower-level module (and far less resource-consuming) than the huge pandas library. As a result, it never tries to interpret the data and only splits rows into strings. If you know that some columns should contain integer or floating-point values, you will have to convert the values yourself.

import csv

with open('diamonds_testing.csv') as csv_file:
    float_columns = (0, 4, 5, 6, 7)
    csv_reader = csv.reader(csv_file)
    # Convert only the cells whose column index is listed in float_columns.
    diamond = [[float(v) if i in float_columns else v for i, v in enumerate(row)]
               for row in csv_reader]

Or you can have pandas guess the data types:

import pandas as pd

diamond = [row[1].to_list()
           for row in pd.read_csv('diamonds_testing.csv', header=None).iterrows()]

2 Comments

1 frames
<ipython-input-26-755a1c6fed7d> in <listcomp>(.0)
      2     float_columns = (0, 4, 5, 6, 7)
      3     csv_reader = csv.reader(csv_file)
----> 4     diamond = [[float(v) if i in float_columns else v for i, v in enumerate(row)]
      5                for row in csv_reader]

ValueError: could not convert string to float: 'carat'
@Joseph You said you wanted to convert this column to float. So I just expected it to only contain numbers.
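The 'carat' in the error message suggests that the first row of the file is a header. A minimal sketch of handling that case, assuming exactly one header row: read it separately with next() and convert only the rows that follow. `read_with_header` is a hypothetical helper, the data is an in-memory stand-in for the real file, and the header names other than 'carat' are assumed.

```python
import csv
import io

# Stand-in for diamonds_testing.csv. Only 'carat' is confirmed by the error
# message; the other header names are assumptions made for this sketch.
SAMPLE = """carat,cut,color,clarity,table,x,y,z
0.23,Ideal,E,SI2,55,3.95,3.98,2.43
0.31,Very Good,J,SI1,62,4.39,4.43,2.62
"""

FLOAT_COLUMNS = {0, 4, 5, 6, 7}


def read_with_header(csv_file, float_columns=FLOAT_COLUMNS):
    """Return (header, rows), converting the chosen columns of each data row."""
    reader = csv.reader(csv_file)
    header = next(reader)  # assumed: the first row holds the column names
    rows = [
        [float(v) if i in float_columns else v for i, v in enumerate(row)]
        for row in reader
    ]
    return header, rows


header, rows = read_with_header(io.StringIO(SAMPLE))
# Put the header back in front so diamond[1] is still the first data row,
# matching the indexing used in the question.
diamond = [header] + rows
print(diamond[1])
print(diamond[2])
```

For the real file, the same function could be called as `with open('diamonds_testing.csv', newline='') as csv_file: header, rows = read_with_header(csv_file)`.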
0

You can try this:

import csv

def validate(num):
    # Try int first, then float; return the value unchanged if neither works.
    try:
        return int(num)
    except (ValueError, TypeError):
        try:
            return float(num)
        except (ValueError, TypeError):
            return num

with open('diamonds_testing.csv') as csv_file:
    csv_reader = csv.reader(csv_file)
    diamond = list(csv_reader)

# Iterate over the rows themselves; a row cannot be used as a list index.
diamond = [[validate(v) for v in row] for row in diamond]

Sample output

print(diamond[1])
print(diamond[2])

Output:

[0.23, 'Ideal', 'E', 'SI2', 55, 3.95, 3.98, 2.43]
[0.31, 'Very Good', 'J', 'SI1', 62, 4.39, 4.43, 2.62]

2 Comments

I get the error:

TypeError                                 Traceback (most recent call last)
<ipython-input-23-17d005a3a666> in <module>()
      4
      5     for i in diamond:
----> 6         diamond[i] = [validate(v) for v in diamond[i]]

TypeError: list indices must be integers or slices, not list
Can you add the value of diamond to the question?
