14

I'm looking for the easiest way to convert all non-numeric data (including blanks) in Python to zeros. Taking the following for example:

someData = [[1.0,4,'7',-50],['8 bananas','text','',12.5644]]

I would like the output to be as follows:

desiredData = [[1.0,4,7,-50],[0,0,0,12.5644]]

So '7' should be 7, but '8 bananas' should be converted to 0.

1
  • And for numeric types, do you want the type to stay unchanged? I mean, for example, int not being converted to float or vice versa. It would be easier if you were aiming for a single type (rather than several numeric types). Commented Sep 20, 2015 at 14:51

9 Answers

13
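import numbers

def mapped(x):
    # Values that are already numeric (int, float, Decimal, ...) pass through unchanged.
    if isinstance(x, numbers.Number):
        return x
    # Try int first so that '7' becomes 7 rather than 7.0, then fall back to float.
    for tpe in (int, float):
        try:
            return tpe(x)
        except ValueError:
            continue
    return 0

for sub in someData:
    # Slice assignment replaces the contents of each sublist in place.
    sub[:] = map(mapped, sub)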

print(someData)
[[1.0, 4, 7, -50], [0, 0, 0, 12.5644]]

It will work for different numeric types:

In [4]: from decimal import Decimal

In [5]: someData = [[1.0,4,'7',-50 ,"99", Decimal("1.5")],["foobar",'8 bananas','text','',12.5644]]

In [6]: for sub in someData:
   ...:         sub[:] = map(mapped,sub)
   ...:     

In [7]: someData
Out[7]: [[1.0, 4, 7, -50, 99, Decimal('1.5')], [0, 0, 0, 0, 12.5644]]

The isinstance(x, numbers.Number) check catches sub-elements that are already numeric (floats, ints, etc.). If the value is not a numeric type, we first try casting it to int and then to float; if neither succeeds, we simply return 0.
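The same try-int-then-float idea can also build a new nested list instead of mutating someData in place. The following is only a minimal sketch of that variant: the names to_number, some_data, and cleaned are illustrative and not part of the answer above, and catching TypeError (so that values such as None also become 0) is an added assumption.

```python
import numbers

def to_number(value):
    """Return value as a number, or 0 if it cannot be read as one."""
    # Already numeric (int, float, Decimal, ...): keep the original type.
    if isinstance(value, numbers.Number):
        return value
    for cast in (int, float):
        try:
            return cast(value)
        except (ValueError, TypeError):
            # ValueError: unparsable string; TypeError: e.g. None.
            continue
    return 0

some_data = [[1.0, 4, '7', -50], ['8 bananas', 'text', '', 12.5644, None]]

# Build a new nested list; some_data itself is left untouched.
cleaned = [[to_number(item) for item in row] for row in some_data]
print(cleaned)
```

Because float() accepts strings such as '1e3', this variant also handles scientific notation without extra work.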


5

Another solution using regular expressions

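import re

def toNumber(e):
    # Non-string values are assumed to be numeric already.
    if not isinstance(e, str):
        return e
    # Raw strings keep the backslashes in the patterns intact.
    if re.match(r"^-?\d+\.\d+$", e):
        return float(e)
    if re.match(r"^-?\d+$", e):
        return int(e)
    return 0

someData = [[1.0,4,'7',-50],['8 bananas','text','',12.5644]]
# In Python 3, map() returns an iterator, so wrap it in list().
someData = [list(map(toNumber, row)) for row in someData]
print(someData)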

you get:

[[1.0, 4, 7, -50], [0, 0, 0, 12.5644]]

Note: this does not work for numbers in scientific notation.
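If scientific notation matters, the pattern can be extended with an optional exponent. The sketch below is one possible way to do that, not the answer's own code: NUMBER_RE, INT_RE, and to_number_sci are hypothetical names, and accepting a leading "+" and forms like ".5" or "5." is an assumption about what should count as a number.

```python
import re

# Optional sign, then digits with an optional fraction (or a bare fraction),
# then an optional exponent such as e3 or E-2.
NUMBER_RE = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)([eE][+-]?[0-9]+)?")
INT_RE = re.compile(r"[+-]?[0-9]+")

def to_number_sci(value):
    """Convert numeric-looking strings; map every other string to 0."""
    if not isinstance(value, str):
        return value
    text = value.strip()
    # fullmatch requires the whole string to match, so no ^...$ is needed.
    if INT_RE.fullmatch(text):
        return int(text)
    if NUMBER_RE.fullmatch(text):
        return float(text)
    return 0

print([to_number_sci(s) for s in ['7', '1e3', '-2.5E-2', '8 bananas', '']])
```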


1

Considering you need both int and float data types, you should try the following code:

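import ast

desired_data = []
for sub_list in someData:
    desired_sublist = []
    for element in sub_list:
        if isinstance(element, (int, float)):
            # Already numeric: keep it (eval/literal_eval only accept strings).
            desired_sublist.append(element)
            continue
        try:
            # literal_eval parses literals only, so it cannot run arbitrary code.
            some_element = ast.literal_eval(element)
        except (ValueError, SyntaxError, TypeError):
            some_element = 0
        # Reject non-numeric literals such as "'abc'" or "[1, 2]".
        if not isinstance(some_element, (int, float)):
            some_element = 0
        desired_sublist.append(some_element)
    desired_data.append(desired_sublist)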

This might not be the optimal way to do it, but still it does the job that you asked for.


1
lists = [[1.0,4,'7',-50], ['1', 4.0, 'banana', 3, "12.6432"]]
nlists = []
for lst in lists:
    nlst = []
    for e in lst:
        # Check if number can be a float
        if '.' in str(e):
            try:
                n = float(e)
            except ValueError:
                n = 0
        else:
            try:
                n = int(e)
            except ValueError:
                n = 0

        nlst.append(n)
    nlists.append(nlst)

print(nlists)


1

Not surprisingly, Python has a way to check if something is a number:

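import numbers
from collections.abc import Sequence

def num(x):
    # Leave values that are already numeric untouched;
    # int(12.5644) would otherwise truncate to 12.
    if isinstance(x, numbers.Number):
        return x
    try:
        return int(x)
    except ValueError:
        try:
            return float(x)
        except ValueError:
            return 0

def zeronize(data):
    # Recurse into nested sequences, but treat strings as leaf values.
    return [zeronize(x) if isinstance(x, Sequence) and not isinstance(x, str) else num(x) for x in data]

someData = [[1.0,4,'7',-50],['8 bananas','text','',12.5644]]
desiredData = zeronize(someData)
# desiredData is now [[1.0, 4, 7, -50], [0, 0, 0, 12.5644]]

A function is defined in case you have nested lists of arbitrary depth. The code above targets Python 3; on Python 2.x, use collections.Sequence and basestring instead of collections.abc.Sequence and str.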
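To illustrate the arbitrary-depth claim, here is a small self-contained sketch run on a more deeply nested input. The names as_number, zero_non_numeric, and nested are hypothetical, and treating bytes as a leaf value alongside str is an assumption. Note that nested tuples come back as lists, since the comprehension always builds a list.

```python
import numbers
from collections.abc import Sequence

def as_number(value):
    """Return value as int or float where possible, otherwise 0."""
    if isinstance(value, numbers.Number):
        return value
    for cast in (int, float):
        try:
            return cast(value)
        except (ValueError, TypeError):
            continue
    return 0

def zero_non_numeric(data):
    """Recursively convert a nested sequence, zeroing non-numeric leaves."""
    return [
        zero_non_numeric(item)
        if isinstance(item, Sequence) and not isinstance(item, (str, bytes))
        else as_number(item)
        for item in data
    ]

nested = [1.0, ['7', ['8 bananas', [12.5644, '']]], (3, 'x')]
print(zero_non_numeric(nested))
```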



1

As an alternative, you can use the decimal module within a nested list comprehension:

>>> from decimal import Decimal
>>> [[Decimal(i) if (isinstance(i,str) and i.isdigit()) or isinstance(i,(int,float)) else 0 for i in j] for j in someData]
[[Decimal('1'), Decimal('4'), Decimal('7'), Decimal('-50')], [0, 0, 0, Decimal('12.56439999999999912461134954')]]

Note that an advantage of Decimal is that the same constructor accepts a digit string, a float, or an int, and the resulting value still supports arithmetic with plain integers:

>>> Decimal('7')+3
Decimal('10')
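The isdigit() test only accepts unsigned digit strings, so '-7' or '12.5' would become 0, and passing a float directly to Decimal exposes its exact binary expansion. One possible variation, sketched below under the assumption that every value should end up as a Decimal, lets Decimal do the parsing and falls back to zero on InvalidOperation. The names to_decimal and some_data are illustrative only.

```python
from decimal import Decimal, InvalidOperation

def to_decimal(value):
    """Return value as a Decimal, or Decimal(0) if it is not numeric."""
    if isinstance(value, Decimal):
        return value
    try:
        # Going through str() keeps 12.5644 as Decimal('12.5644')
        # instead of the float's exact binary expansion.
        return Decimal(str(value))
    except InvalidOperation:
        return Decimal(0)

some_data = [[1.0, 4, '7', -50], ['8 bananas', 'text', '', 12.5644]]
print([[to_decimal(item) for item in row] for row in some_data])
```

Be aware that Decimal also parses strings such as 'NaN' and 'Infinity', so those would need an extra check if they can occur in the data.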


1

Integers, floats, and negative numbers in quotes are fine:

def is_number(s):
    try:
        float(s)
        return True
    except ValueError:
        return False

def is_int(s):
    try:
        int(s)
        return True
    except ValueError:
        return False

someData = [[1.0,4,'7',-50, '12.333', '-90'],['-333.90','8 bananas','text','',12.5644]]

for l in someData:
    for i, el in enumerate(l):
        if isinstance(el, str) and not is_number(el):
            l[i] = 0
        elif isinstance(el, str) and is_int(el):
            l[i] = int(el)
        elif isinstance(el, str) and is_number(el):
            l[i] = float(el)

print(someData)

Output:

[[1.0, 4, 7, -50, 12.333, -90], [-333.9, 0, 0, 0, 12.5644]]

2 Comments

I like the simplicity of this approach, but it converts '7' to 0 instead of 7.
@user1882017, thanks, I missed the '7' case... added an isdigit() check.
1

A one-liner:

import re
# The leading -? lets negative numbers such as -50 through instead of zeroing them.
result = [[0 if not re.match(r"^-?(\d+(\.\d*)?|\.\d+)$", str(s)) else int(str(s)) if str(s).lstrip("-").isdigit() else float(str(s)) for s in xs] for xs in someData]
>>> result
[[1.0, 4, 7, -50], [0, 0, 0, 12.5644]]


0

I assume the blanks you are referring to are empty strings. Since you want to convert all strings, regardless of whether they contain characters, we can simply check whether an object is a string and, if it is, replace it with the integer 0.

cleaned_data = []
for array in someData:
    for item in array:
        cleaned_data.append(0 if type(item) == str else item)

>>> cleaned_data
[1.0, 4, 0, -50, 0, 0, 0, 12.5644]
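The loop above flattens the two sublists into a single list. If the nesting should be kept, the same "every string becomes 0" rule can be written as a nested comprehension. This is only a sketch of that variation, with cleaned_rows and some_data as illustrative names; note that under this rule '7' also becomes 0 rather than 7.

```python
some_data = [[1.0, 4, '7', -50], ['8 bananas', 'text', '', 12.5644]]

# Replace every string with 0 while preserving the list-of-lists shape.
cleaned_rows = [
    [0 if isinstance(item, str) else item for item in row]
    for row in some_data
]
print(cleaned_rows)
```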
