I have a large numpy.ndarray and I need to downsample it by keeping only the rows where one column's value falls within a given range. My solution works, but it is very slow:

data_table = data_table[[i for i in range(0, len(data_table)) if data_table[i][7] > 0.2 and data_table[i][7] < 0.75]]

Does anybody know the fastest way to do this?

1 Answer

Slice out the relevant column, compare it against both thresholds in a vectorized manner to get a boolean mask of valid rows, and then index the array with that mask to keep only those rows:

out = data_table[(data_table[:,7] > 0.2) & (data_table[:,7] < 0.75)]
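
To make the intermediate steps visible, here is a minimal sketch that builds the mask separately and checks it against the list-comprehension version from the question. The random `data_table` (1000 rows, 10 columns) is a hypothetical stand-in for the real data, and the names `col`, `mask`, and `slow` are only for illustration.

```python
import numpy as np

# Hypothetical stand-in for the asker's array: 1000 rows, 10 columns, values in [0, 1).
rng = np.random.default_rng(0)
data_table = rng.random((1000, 10))

# View of column 7 (no copy is made by basic slicing).
col = data_table[:, 7]

# Boolean mask with one entry per row: True where the value lies strictly
# between the thresholds. The parentheses are required because & binds
# more tightly than the comparison operators.
mask = (col > 0.2) & (col < 0.75)

# Boolean indexing keeps only the rows where the mask is True.
out = data_table[mask]

# The list-comprehension approach from the question, for comparison.
slow = data_table[[i for i in range(len(data_table)) if 0.2 < data_table[i][7] < 0.75]]
```

Both approaches should select the same rows; the masked version avoids the Python-level loop, which is where the original spends most of its time on a large array.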