Numpy how to? ...Peak finding "prominance" calculation in numpy syntax

Question

A somewhat new to numpy user, trying to see a "how to in numpy" solution to an algorithm-in-numpy-syntax problem.

I hope it's OK to ask for some help here, as I'm stumped and would love some advice - I hope this isn't TLDR'ish !!

Background:

I'm trying to calculate "prominence" manually to remove peaks from peak-finding, and trying to see if there is a simple "numpy" way to do that, rather than for loops.

Before anyone says, I know scipy.find_peaks can remove peaks with specified prominence, but I was trying to figure out how to do it by-hand using numpy syntax, not for looping over indexes (I know C, I'm trying to learn numpy...):

Reason:

The reason is I need to change the existing scipy logic from "if prominence on either left or right side of a peak is OK (ie big enough), then the peak should remain" to "if either left or right prominence is too small, then delete the peak".

This is equivalent to changing the max(left_min, right_min) code on line 252 of the scipy code to a min(left_min, right_min)

The scipy _peak_prominences code to find prominence values uses for loops foward/back over the arrays - I was trying to learn if that is possibly in numpy notation/syntax (without for looping), but after a day's effort, couldn't figure out how to do that - so would love some numpy advice!!

The conceptual logic is simple: find peaks in a profile (1d array), then calculate the height of each peak over the left/right neighbouring valleys (ie the "prominence"), and then remove peaks whose left or right height is less than a certain threshold - ie they aren't "peaky" enough on both sides.

In the example above I want to remove the two center peaks, because the height (prominence) of both over the central "valley" is too small - even though the surrounding two outside valleys heights OK.

So, my numpy help question:

I know if I have a peaks array that is indexes into a profile array, I can use np.delete to remove "not peaky enough peaks", if I can calculate a left_height and right_height set of arrays for each peak:

np.delete(peaks, np.where( (left_height<10) | (right_height<10) )[0])

I know for specific values in a peaks array, I can find the "valley" between them by slicing between peak indexes, and finding the min profile value in the slice:

profile[peaks[0]:peaks[1]].min()

what I can't figure out is how to, numpy-wise, calculate the left_height and right_height arrays. Something like

left_height = profile[peaks] - profile[peaks[-1]:peaks].min()

this is essentially the logic of "use a peaks array, with current and previous position, to make a slice, then use that slice to get the profile values, then calculate the minimum over those profile values.

Is that kind of logic possible using just python-istic numpy, or does it need a manual for loop?

Community · Accepted Answer · 2020-06-20 09:12:55Z

2

That's an great way to learn! :)

The first version of peak_prominences was actually added as a pure Python implementation which you can find here.

That implementation still loops over each peak once and is obviously slower than the Cython implementation but may still be what you are looking for. You'd essentially want to change how the contour height is calculated on the lines 403 and 404 as you suggested yourself already.

what I can't figure out is how to, numpy-wise, calculate the left_height and right_height arrays. Something like

left_height = profile[peaks] - profile[peaks[-1]:peaks].min()

You seem to have gotten quite close to to the solution in at line 382 to 400 where the left and right height are calculated (ignore the window related stuff earlier). Here is the crucial bit that finds the left and right height for a single peak using your variable names:

# Positions where profile is larger than current peak height
greater_peak = np.where(profile > profile[peak])[0]

try:
    # Nearest position to the left of peak with
    # profile[left] > profile[peak]
    left = greater_peak[greater_peak < peak].max()
except ValueError:
    left = 0
try:
    # Nearest position to right of peak with
    # profile[right] > profile[peak]
    right = greater_peak[greater_peak > peak].min()
except ValueError:
    right = None

# Base indices to the left and right of peak in profile
left_height = profile[left:peak].argmin() + left
right_height = profile[peak:right].argmin() + peak

I hope this helps in your endeavor.

Disclaimer: I'm the author of that function.

edited Jun 20, 2020 at 9:12

CommunityBot

11 silver badge

answered Sep 7, 2019 at 0:05

lagru

2312 silver badges10 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

Richard Over a year ago

Hey thanks for the quick reply! ...and great to hear from you as you were the original author! I'm encouraged that even you did if with an outer for loop over the indicies - i was trying to figure out if it's possible to do that purely python-numpy'ish that was stumping me! I'll use a similiar approach to manually get the slices in a for loop

Richard Over a year ago

Yes - on line 403 I was essentially looking for the logic there to be a min rather than max (ie a "highest_contour"). ...I'll just decorate the existing scipy code with a couple of lines to do the same thing and good to go. thanks again!

Collectives™ on Stack Overflow

Numpy how to? ...Peak finding "prominance" calculation in numpy syntax

1 Answer 1

2 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

2 Comments

Your Answer

Sign up or log in

Post as a guest

Related