34

I have two NumPy arrays x and y. When I try to fit my data using exponential function and curve_fit (SciPy) with this simple code

#!/usr/bin/env python
from pylab import *
from scipy.optimize import curve_fit

x = np.array([399.75, 989.25, 1578.75, 2168.25, 2757.75, 3347.25, 3936.75, 4526.25, 5115.75, 5705.25])
y = np.array([109,62,39,13,10,4,2,0,1,2])

def func(x, a, b, c, d):
    return a*np.exp(b-c*x)+d

popt, pcov = curve_fit(func, x, y)

I get wrong coefficients popt

[a,b,c,d] = [1., 1., 1., 24.19999988]

What is the problem?

1

2 Answers 2

53

First comment: since a*exp(b - c*x) = (a*exp(b))*exp(-c*x) = A*exp(-c*x), a or b is redundant. I'll drop b and use:

import matplotlib.pyplot as plt

def func(x, a, c, d):
    return a*np.exp(-c*x)+d

That isn't the main issue. The problem is simply that curve_fit fails to converge to a solution to this problem when you use the default initial guess (which is all 1s). Check pcov; you'll see that it is inf. This is not surprising, because if c is 1, most of the values of exp(-c*x) underflow to 0:

In [32]: np.exp(-x)
Out[32]: 
array([  2.45912644e-174,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000])

This suggests that c should be small. A better initial guess is, say, p0 = (1, 1e-6, 1). Then I get:

In [36]: popt, pcov = curve_fit(func, x, y, p0=(1, 1e-6, 1))

In [37]: popt
Out[37]: array([  1.63561656e+02,   9.71142196e-04,  -1.16854450e+00])

This looks reasonable:

In [42]: xx = np.linspace(300, 6000, 1000)

In [43]: yy = func(xx, *popt)

In [44]: plt.plot(x, y, 'ko')
Out[44]: [<matplotlib.lines.Line2D at 0x41c5ad0>]

In [45]: plt.plot(xx, yy)
Out[45]: [<matplotlib.lines.Line2D at 0x41c5c10>]

enter image description here

Sign up to request clarification or add additional context in comments.

3 Comments

Why do you use -c instead of c? curve_fit can find a negative c if necessary, no?
@RenéG: That's the convention that drastega used in the question.
Another approach to initial parameters (using default values, that is) is normalizing x to (approximately) 0—1, e.g., ξ=x/k, estimate a, c' and d and eventually have c=c'/k.
9

Firstly I would recommend modifying your equation to a*np.exp(-c*(x-b))+d, otherwise the exponential will always be centered on x=0 which may not always be the case. You also need to specify reasonable initial conditions (the 4th argument to curve_fit specifies initial conditions for [a,b,c,d]).

This code fits nicely:

from pylab import *
from scipy.optimize import curve_fit

x = np.array([399.75, 989.25, 1578.75, 2168.25, 2757.75, 3347.25, 3936.75, 4526.25, 5115.75, 5705.25])
y = np.array([109,62,39,13,10,4,2,0,1,2])

def func(x, a, b, c, d):
    return a*np.exp(-c*(x-b))+d

popt, pcov = curve_fit(func, x, y, [100,400,0.001,0])
print popt

plot(x,y)
x=linspace(400,6000,10000)
plot(x,func(x,*popt))
show()

4 Comments

Where do initial conditions come from?
@MarcinZdunek this was a while ago so I don't remember exactly. The amplitude will have been estimated from the graph. The others may have been determined via trial and error, although the value for c can be estimated too (see the accepted answer of this question)
@MarcinZdunek The default initial values are fine if you normalize both data ranges and afterwards denormalize the estimated parameters...
I'll just add that looking over this again, I think the initial conditions for a and b came from the first y and x values (assuming values are in order), c can be estimated as in the accepted answer, and the estimate for d came from the final y values which are ~0. If you're having trouble with initial conditions, this can be a good starting point.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.