
I have a huge NumPy ndarray, shown below, which contains 200 million rows.

[['5040' '5' 'load_video' '2015-03-30 12:31:27.727452']
 ['5040' '44' 'load_video' '2015-03-30 12:33:26.764407']
 ['5040' '34' 'load_video' '2015-03-30 12:31:26.102226']
 ...,
 ['3076' '1' 'play_video' '2015-05-31 05:52:33.395859']
 ['3076' '1' 'seek_video' '2015-05-31 05:52:36.941808']
 ['1512' '8' 'load_video' '2015-05-31 07:19:56.715000']]

I want to delete all rows that contain the string 'load_video'. Is there a way to do this?

PS: Next, I want to sort the rows by the first column. That column is stored as strings, and I could not find a way to use astype to convert only the first column to int. What can I do?

These might be very simple questions, but since I am new to Python, your answers would help me a lot. Thanks!

  • Since the data set is quite huge, I think solution with np.apply_along_axis() would be better. Commented Nov 27, 2017 at 3:41
  • Aside: if you're working with non-numerical data you're probably going to have a much better time using pandas than pure numpy. Commented Nov 27, 2017 at 3:44
  • @DSM You mean using a dataframe from pandas? Commented Nov 27, 2017 at 3:47
  • ndarray seems to be an odd choice for this data structure. pandas dataframe is more suitable. Commented Nov 27, 2017 at 3:49
  • @Vinnton: yep. For tabular data with different column types, using a frame will make life much easier. Commented Nov 27, 2017 at 4:21

1 Answer


This list comprehension keeps only the entries that do not contain 'load_video':

new_list = [x for x in old_list if 'load_video' not in x]
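
For an array with 200 million rows, a vectorized boolean mask is likely to be much faster than a Python-level loop. The following is a minimal sketch, assuming the event name is always in the third column and the first column always holds integer-like strings. The name `data` and the final sample row (ID '120') are made up for illustration; the other rows come from the question.

```python
import numpy as np

# Small stand-in for the 200-million-row array in the question.
# The last row (ID '120') is hypothetical, added so the sort has something to reorder.
data = np.array([
    ['5040', '5', 'load_video', '2015-03-30 12:31:27.727452'],
    ['5040', '44', 'load_video', '2015-03-30 12:33:26.764407'],
    ['3076', '1', 'play_video', '2015-05-31 05:52:33.395859'],
    ['3076', '1', 'seek_video', '2015-05-31 05:52:36.941808'],
    ['1512', '8', 'load_video', '2015-05-31 07:19:56.715000'],
    ['120', '2', 'play_video', '2015-05-31 08:00:00.000000'],
])

# Boolean mask: True for every row whose third column is not 'load_video'.
mask = data[:, 2] != 'load_video'
filtered = data[mask]

# Convert only the first column to int to build a sort key;
# the array itself keeps its string dtype.
order = np.argsort(filtered[:, 0].astype(int), kind='stable')
sorted_rows = filtered[order]

print(sorted_rows)
```

Sorting through `argsort` on a converted copy of the first column avoids having to change the dtype of a single column in place, which a homogeneous ndarray does not allow.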

3 Comments

Unfortunately, I got an error: AttributeError: 'numpy.ndarray' object has no attribute 'split'.
I modified your solution to TrainData = [x for x in TrainData if x[2] != 'load_video'] and it works! Thanks a lot!
Yes, you'd need to use your own list name instead of old_list, glad it works!!
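
The pandas route suggested in the comments on the question could look roughly like the sketch below. The column names (`user_id`, `value`, `event`, `timestamp`) are hypothetical, since the question does not say what the columns mean, and the last sample row (ID '120') is made up for illustration.

```python
import pandas as pd

# Sample rows from the question, plus one hypothetical row (ID '120').
rows = [
    ['5040', '5', 'load_video', '2015-03-30 12:31:27.727452'],
    ['3076', '1', 'play_video', '2015-05-31 05:52:33.395859'],
    ['3076', '1', 'seek_video', '2015-05-31 05:52:36.941808'],
    ['1512', '8', 'load_video', '2015-05-31 07:19:56.715000'],
    ['120', '2', 'play_video', '2015-05-31 08:00:00.000000'],
]

# Hypothetical column names; replace them with the real meaning of each column.
df = pd.DataFrame(rows, columns=['user_id', 'value', 'event', 'timestamp'])

# Each column can have its own dtype, unlike a plain ndarray.
df['user_id'] = df['user_id'].astype(int)
df['timestamp'] = pd.to_datetime(df['timestamp'])

# Drop the 'load_video' rows, then sort by the integer ID column.
result = df[df['event'] != 'load_video'].sort_values('user_id', kind='mergesort')

print(result)
```

Because a DataFrame stores each column separately, converting only the first column to int is a one-line operation here.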
