2

I am trying to read a data file written by a Fortran program, in which every once in a while there is a very small float like 0.3299880-104. The error message is:

>np.loadtxt(filename, usecols = (1,))

  File "/home/anaconda2/lib/python2.7/site-packages/numpy/lib/npyio.py", line 928, in loadtxt
    items = [conv(val) for (conv, val) in zip(converters, vals)]

  File "/home/anaconda2/lib/python2.7/site-packages/numpy/lib/npyio.py", line 659, in floatconv
    return float(x)

ValueError: invalid literal for float(): 0.3299880-104

Can I do something to make Numpy able to read this data file anyway?

6
  • Are you sure the number you're trying to read is 0.3299880e-104 and not just 0.3299880-104 ? Commented Dec 31, 2015 at 10:37
  • 2
    Personally I'd use a utility (I like sed) to modify numbers from 0.3299880-104 to 0.3299880e-104. I believe that Python itself can do that sort of thing, so you might want to write a routine to massage the file before reading it. Commented Dec 31, 2015 at 10:44
  • 1
    Possibly useful: stackoverflow.com/q/13274066. Commented Dec 31, 2015 at 10:47
  • 1
    @Shark - the number I am trying to read is without the e, just - 0.3299880-104 Commented Dec 31, 2015 at 11:07
  • 1
    I'm inclined to agree. It looks like you might develop a converters dictionary for loadtxt to handle this. I'd suggest you put a little work into figuring that out and then pose a new more specific question. (Or a new answer to the linked question if you do figure it out) Commented Dec 31, 2015 at 16:13
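The preprocessing idea from the comments can be sketched in pure Python. This is only a minimal sketch: the helper name `insert_missing_exponent` and the sample data are made up for illustration, and it assumes the file holds nothing but whitespace-separated numbers (anything like a date `2015-12-31` would be mangled by the same pattern).

```python
import io
import re

# Assumption: a Fortran exponent with its letter dropped looks like
# <digit or dot><sign><digit>, e.g. "0.3299880-104".
_MISSING_E = re.compile(r"([\d.])([+-])(\d)")

def insert_missing_exponent(text):
    """Return text with 'E' inserted wherever the exponent letter is missing."""
    return _MISSING_E.sub(r"\1E\2\3", text)

# Made-up sample: an index column followed by a value column.
raw = "1 0.3299880-104\n2 0.3299880E+10\n3 -0.3299880\n"
fixed = insert_missing_exponent(raw)
print(fixed)

# The repaired text can be wrapped in a file-like object and handed to any reader;
# here the second column is simply parsed with float().
buffer = io.StringIO(fixed)
values = [float(line.split()[1]) for line in buffer]
print(values)
```

Leading minus signs and values that already carry an `E` are left untouched, because the pattern requires a digit or dot directly before the sign.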

3 Answers

3

As @agentp mentioned in the comments, one approach would be to use the converters= argument to np.genfromtxt to insert the e characters before casting to float:

import numpy as np

# some example strings
strings = "0.3299880-104 0.3299880+104 0.3299880"

# create a "dummy file" (see http://stackoverflow.com/a/11970414/1461210)
try:
    from StringIO import StringIO     # Python2
    f = StringIO(strings)
except ImportError:
    from io import BytesIO            # Python3
    f = BytesIO(strings.encode())

c = lambda s: float(s.decode().replace('+', 'e').replace('-', 'e-'))

data = np.genfromtxt(f, converters=dict(zip(range(3), [c]*3)))

print(repr(data))
# array([  3.29988000e-105,   3.29988000e+103,   3.29988000e-001])

2

The accepted answer is helpful, but it does not handle negative values (-0.3299880 is converted to e-0.3299880) or 2-digit exponents written with an explicit E (0.3299880E+10 is converted to 0.3299880Ee10). Neither result is a valid float literal, and both would result in nan values in the numpy array.

Also, the number of columns in the file to read is hard-coded (it is 3 in this case).

It can be addressed as follows:

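import re
import numpy as np

def read_fortran_data_file(file):
    # `file` is read twice, so pass a filename or a list of strings
    # count the columns in the first row of data
    # (np.atleast_1d keeps this working for single-column files)
    number_columns = np.atleast_1d(np.genfromtxt(file, max_rows=1)).shape[0]

    def c(s):
        # depending on the NumPy version, converters receive bytes or str
        if isinstance(s, bytes):
            s = s.decode()
        # insert the missing "E" between the mantissa and the signed exponent
        return float(re.sub(r"(\d)([\+\-])(\d)", r"\1E\2\3", s))

    # actually load the content of our file
    data = np.genfromtxt(file,
        converters=dict(zip(range(number_columns), [c] * number_columns)),)
    return data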

Testing

np.genfromtxt accepts filenames or lists of strings as input. For the demonstration I'll use the latter, but the function above works just as well with a filename as input.

strings = [
    "0.3299880-104 0.3299880E+10 0.3299880 0.3299880+104 0.3299880E-10 -0.3299880"
]
read_fortran_data_file(strings)
## array([ 3.29988e-105,  3.29988e+009,  3.29988e-001,  3.29988e+103,
##         3.29988e-011, -3.29988e-001])

Note on NaN values:

When using np.genfromtxt, be careful: numbers that were not read properly are replaced by NaN values. An assertion such as the following catches that:

assert np.count_nonzero(np.isnan(data))==0, "data contains nan values"


0

Not numpy, but I use the following regex and function:

import re

# convert d/D to e and add it if missing
fortregexp = re.compile(r'([\d.])[dD]?(((?<=[dD])[+-]?|[+-])\d)')
def fortran_float(num):
    num = fortregexp.sub(r'\1e\2', num)
    return float(num)

text = "0.3299880-104 0.3299880D+10 0.3299880 0.3299880+104 0.3299880E-10 -0.3299880"

nums = [fortran_float(i) for i in text.split()]

print(text)
print(nums)

which gives:

0.3299880-104 0.3299880D+10 0.3299880 0.3299880+104 0.3299880E-10 -0.3299880
[3.29988e-105, 3299880000.0, 0.329988, 3.29988e+103, 3.29988e-11, -0.329988]
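To apply the same idea to a whole file without NumPy, a minimal sketch could look like the following. The helper name `read_fortran_table` and the sample data are made up for illustration; the regex is the one from this answer, and the input is assumed to be whitespace-separated numbers only.

```python
import io
import re

# Same pattern as above: turn d/D into e, or add e if the letter is missing.
fortregexp = re.compile(r'([\d.])[dD]?(((?<=[dD])[+-]?|[+-])\d)')

def fortran_float(num):
    return float(fortregexp.sub(r'\1e\2', num))

def read_fortran_table(lines):
    """Hypothetical helper: parse whitespace-separated rows into lists of floats."""
    return [[fortran_float(tok) for tok in line.split()]
            for line in lines if line.strip()]  # skip blank lines

# Made-up sample standing in for an open text file.
sample = io.StringIO(
    "0.3299880-104 0.3299880D+10\n"
    "\n"
    "-0.3299880    0.3299880+104\n"
)
rows = read_fortran_table(sample)
print(rows)
```

Any iterable of text lines works as input, so an open file object can be passed in place of the `StringIO` buffer.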
