
I am writing code for baseline correction of multiple signals. The structure of the code is like this:

# for each file in a directory
    # read file and populate the x vector
    temp = baseline_als(x, 1000, 0.00001)
    plt.plot(x - temp)
    plt.savefig("newbaseline.png")
    plt.close()

The baseline_als function is as below.

from scipy import sparse
from scipy.sparse.linalg import spsolve
import numpy as np

def baseline_als(y, lam, p, niter=20):
    L = len(y)
    # second-order difference matrix used as the smoothness penalty
    D = sparse.csc_matrix(np.diff(np.eye(L), 2))
    w = np.ones(L)
    for i in range(niter):
        W = sparse.spdiags(w, 0, L, L)
        Z = W + lam * D.dot(D.transpose())
        z = spsolve(Z, w * y)
        # asymmetric weights: p for points above the fit, 1 - p for points below
        w = p * (y > z) + (1 - p) * (y < z)
    return z

Now when I put around 100 files in the directory, the code works fine, although it takes time since the complexity is quite high. But when I have around 10000 files in my directory and run this script, the system freezes after a few minutes. I don't mind a delay in execution, but is there any way to make sure the script finishes executing?

  • Have you run any sort of system monitor when the code "freezes"? Commented Jul 1, 2016 at 7:03
  • I am unsure how I can run a system monitor, since the mouse and keyboard become unresponsive and I have to reboot. Commented Jul 1, 2016 at 7:33
  • You don't say which operating system you use. Start the monitor before you start your program. If you have to reboot then something else might be happening. Have you shown your whole code? Commented Jul 1, 2016 at 7:44
  • I am using Ubuntu 14.04. Yes, that is the whole code except the file-reading part. OK, I will try with the system monitor started before executing now. Commented Jul 1, 2016 at 7:45
  • With a single core? No! Without threading? No! Is your processor alive? Commented Jul 1, 2016 at 8:00

2 Answers


I was able to prevent my CPU from reaching 100%, and the freezes that followed, by using time.sleep(0.02). It takes a long time, but execution completes nonetheless.

Note that you need to import time before using this.
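
A minimal sketch of where the call could go, assuming the sleep is placed at the end of each loop iteration (the directory pattern and the read_signal helper are placeholders, not part of the original code):

import glob
import time

import matplotlib.pyplot as plt

for filename in glob.glob("data/*.txt"):    # hypothetical directory pattern
    x = read_signal(filename)               # read_signal: placeholder for your file-reading code
    temp = baseline_als(x, 1000, 0.00001)
    plt.plot(x - temp)
    plt.savefig("newbaseline.png")
    plt.close()
    time.sleep(0.02)                        # pause briefly so the CPU is not pinned at 100%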



The script consumes too much RAM when you run it over too large a number of files; see Why does a simple python script crash my system.

The process your program runs in stores the arrays and variables for the calculations in process memory, which is RAM, and they accumulate there.

A possible workaround is to run the baseline_als() function in a child process. When the child returns, its memory is freed automatically; see Releasing memory in Python.

Execute the function in a child process:

from multiprocessing import Process, Queue

def my_function(q, x):
    q.put(x + 100)          # do the work in the child and send the result back

if __name__ == '__main__':
    queue = Queue()
    p = Process(target=my_function, args=(queue, 1))
    p.start()
    p.join()                # this blocks until the process terminates
    result = queue.get()
    print(result)

copied from: Is it possible to run function in a subprocess without threading or writing a separate file/script

This prevents RAM from being consumed by unreferenced old variables that your process (program) produces.
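
Adapted to the question's loop, a rough sketch might look like this; read_signal and the directory pattern are placeholders, and baseline_als() is the function from the question, assumed to be defined in the same module:

from multiprocessing import Process, Queue
import glob

import matplotlib.pyplot as plt

def worker(q, y):
    # run the expensive computation in the child; its memory is released when it exits
    q.put(baseline_als(y, 1000, 0.00001))

if __name__ == '__main__':
    for filename in glob.glob("data/*.txt"):    # hypothetical directory pattern
        x = read_signal(filename)               # read_signal: placeholder for your file reading
        queue = Queue()
        p = Process(target=worker, args=(queue, x))
        p.start()
        temp = queue.get()                      # fetch the result before join so the pipe can drain
        p.join()
        plt.plot(x - temp)
        plt.savefig("newbaseline.png")
        plt.close()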

Another possibility is to invoke the garbage collector with gc.collect(); however, this is not recommended (it does not work in some cases).
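
If you do want to try it, a minimal sketch would be to trigger a collection pass after each file (the directory pattern is again a placeholder):

import gc
import glob

for filename in glob.glob("data/*.txt"):    # hypothetical directory pattern
    # ... read the file, run baseline_als(), plot and save, as in the question ...
    gc.collect()                            # explicitly trigger a garbage-collection pass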

More useful links:

memory usage, how to free memory

Python large variable RAM usage

I need to free up RAM by storing a Python dictionary on the hard drive, not in RAM. Is it possible?

