I am using an API that returns what appears to be a CSV string. I need to parse it for two decimal numbers and append those numbers to separate lists as decimal numbers, ignoring the timestamp at the end:

returned_string_from_API = '0,F,F,1.139520,1.139720,0,0,20160608163132000'
decimal_lowest_in_string = []
decimal_highest_in_string = []

Processing time is a factor here, so what is the fastest way to accomplish this?

  • I don't think the csv module is right here, I'd just use str.split(',') Commented Jun 9, 2016 at 8:34

3 Answers

Split the string by comma:

>>> string_values = returned_string_from_API.split(',')
>>> string_values
['0', 'F', 'F', '1.139520', '1.139720', '0', '0', '20160608163132000']

Pick the two decimal fields out of the list:

>>> string_values[3:5]
['1.139520', '1.139720']

Convert to float:

>>> decimal_values = [float(val) for val in string_values[3:5]]
>>> decimal_values
[1.13952, 1.13972]

Append the minimum and maximum to the appropriate lists:

>>> decimal_lowest_in_string = []
>>> decimal_highest_in_string = []
>>> decimal_lowest_in_string.append(min(decimal_values))
>>> decimal_lowest_in_string
[1.13952]
>>> decimal_highest_in_string.append(max(decimal_values))
>>> decimal_highest_in_string
[1.13972]
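
The steps above can be folded into one small helper. This is only a sketch: the function name append_low_high is made up for illustration, and it assumes the two decimals always sit in fields 3 and 4 (0-based) of the response.

```python
returned_string_from_API = '0,F,F,1.139520,1.139720,0,0,20160608163132000'


def append_low_high(line, lows, highs):
    """Append the smaller and larger price field of one line to the lists."""
    # Assumption: the two decimals are always fields 3 and 4 (0-based).
    # maxsplit=5 stops splitting once those fields have been isolated,
    # so the trailing fields and the timestamp are never split apart.
    fields = line.split(',', 5)
    first, second = float(fields[3]), float(fields[4])
    lows.append(min(first, second))
    highs.append(max(first, second))


decimal_lowest_in_string = []
decimal_highest_in_string = []
append_low_high(returned_string_from_API,
                decimal_lowest_in_string,
                decimal_highest_in_string)
print(decimal_lowest_in_string, decimal_highest_in_string)
```

If exact decimal representation matters more than speed, decimal.Decimal can be substituted for float in the helper.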

2 Comments

That is exactly what I was looking for, thanks Doron!
You're welcome. I think there are some ways to make it faster, but they require more information about the format. If the lengths of the values in the string are constant, you can use a substring to extract the 1.139520,1.139720 part and split only that, so you don't have to process the whole input.
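
A rough sketch of that substring idea follows. It only works under a strong assumption that the question does not confirm: every response starts with the same 6-character prefix ("0,F,F,") and both values are exactly 8 characters wide. The names START, END and extract_fixed_width are hypothetical.

```python
returned_string_from_API = '0,F,F,1.139520,1.139720,0,0,20160608163132000'

# Assumption: fixed layout. The prefix "0,F,F," is 6 characters long and
# each value is exactly 8 characters wide, separated by one comma.
START = 6
END = 23  # 6 (prefix) + 8 (first value) + 1 (comma) + 8 (second value)


def extract_fixed_width(line):
    """Slice out the 'value,value' part and split only that piece."""
    first_text, second_text = line[START:END].split(',')
    return float(first_text), float(second_text)


first, second = extract_fixed_width(returned_string_from_API)
print(first, second)
```

If the field widths ever vary, the slice will cut a value in the wrong place, so the plain split on the whole string is the safer default.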

1) A version that does not rely on csv:

returned_string_from_API = '0,F,F,1.139520,1.139720,0,0,20160608163132000'

def isfloat(value):
  try:
    float(value)
    return True
  except ValueError:
    return False

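# Note: integer-looking fields such as '0' and the timestamp also pass isfloat.
# list() materializes the result, since filter() is lazy in Python 3.
float_numbers = list(filter(isfloat, returned_string_from_API.split(',')))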

2) Try the pandas package.

A regular expression is another option and can be fast, though it is worth timing against str.split before relying on that. Readability is another issue.

import re

returned_string_from_API = '0,F,F,1.139520,1.139720,0,0,20160608163132000'
decimal_lowest_in_string = []
decimal_highest_in_string = []

re_check = re.compile(r"[0-9]+\.\d*")
# findall returns strings, so convert to float before taking min and max
m = [float(x) for x in re_check.findall(returned_string_from_API)]

decimal_lowest_in_string.append(min(m))
decimal_highest_in_string.append(max(m))
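
Since speed is the point of the question, it may help to time the candidates rather than guess. The sketch below uses the standard timeit module to compare a split-based and a regex-based extraction; the helper names (with_split, with_regex, benchmark) are made up for illustration, and the results will depend on the machine and Python version.

```python
import re
import timeit

line = '0,F,F,1.139520,1.139720,0,0,20160608163132000'

# Requires digits on both sides of the dot, so plain integers are skipped.
PATTERN = re.compile(r"\d+\.\d+")


def with_split(s):
    # Assumption: the two decimals are always fields 3 and 4 (0-based).
    fields = s.split(',')
    return float(fields[3]), float(fields[4])


def with_regex(s):
    # Assumption: exactly two dotted numbers appear in each line.
    first, second = PATTERN.findall(s)
    return float(first), float(second)


def benchmark(number=100_000):
    """Print the total time each approach needs for `number` calls."""
    for func in (with_split, with_regex):
        seconds = timeit.timeit(lambda: func(line), number=number)
        print(f"{func.__name__}: {seconds:.3f} s for {number} calls")


benchmark()
```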
