I have a question similar to this one, but in my case the column I need to check when extracting rows from the dataframe holds a list of lists, not a numeric value.

My data looks like this:

import pandas as pd 

data = {
    'A' : [1, 2, 3, 4, 5],
    'B' : [[[1, 2], [3, 4]], [[0, 2], [5, 6]], [[1, 3], [7, 8]], [[0, 4], [9, 10]], [[1, 5], [11, 12]]]
}
dataF = pd.DataFrame(data)
print(dataF)

I need to extract rows from the dataframe based on the first element of the first list in each row of B. This value will always be 0 or 1.

Once this problem is solved I will have a dataframe looking like:

import pandas as pd 

data = {
    'A' : [1, 2, 3, 4, 5],
    'B' : [[[1, 2], [3, 4]], [[0, 2], [5, 6]], [[1, 3], [7, 8]], [[0, 4], [9, 10]], [[1, 5], [11, 12]]],
    'C' : [[[0, 2], [3, 4]], [[1, 2], [5, 6]], [[0, 3], [7, 8]], [[0, 4], [9, 10]], [[1, 5], [11, 12]]]
}
dataF = pd.DataFrame(data)
print(dataF)

From this dataframe I need to take all rows in which the first element of the first list in B or C is 1. That is, rows 0, 1, 2 and 4.

EDIT based on the answer from WeNYoBen:

To extract all rows from a dataframe in which the first element of the first list in B or C is 1, I am using the code below. However, this approach requires checking for duplicate rows in extDF and sorting extDF by the values in one column. I guess there is a way to do this that does not require these two steps.

import pandas as pd 

data = {
    'A' : [1, 2, 3, 4, 5],
    'B' : [[[1, 2], [3, 4]], [[0, 2], [5, 6]], [[1, 3], [7, 8]], [[0, 4], [9, 10]], [[1, 5], [11, 12]]],
    'C' : [[[0, 2], [3, 4]], [[1, 2], [5, 6]], [[0, 3], [7, 8]], [[0, 4], [9, 10]], [[1, 5], [11, 12]]]
}
dataF = pd.DataFrame(data)


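# Collect the matching rows for each column, then combine them.
# (DataFrame.append was removed in pandas 2.0; pd.concat replaces it.)
matches = []
for i in [1, 2]:
    matches.append(dataF[dataF.iloc[:, i].str[0].str[0].isin([1])].copy())
extDF = pd.concat(matches)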

extDF.drop_duplicates(keep='first', inplace=True, subset='A')
extDF.sort_values(by='A', inplace=True)
extDF.reset_index(drop=True, inplace=True)

print(extDF)
  • You need to filter the dataframe? what does that mean? filter how? And the same value as before? before what, where? Commented Jul 22, 2019 at 21:19
  • confused by the word filter Commented Jul 22, 2019 at 21:22

1 Answer

Based on what you described:

# isin() takes the value(s) to keep; pass [1] to keep only rows whose first element is 1
Newdf = dataF[dataF.B.str[0].str[0].isin([0, 1])].copy()
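
To see what the chained `.str[0].str[0]` is doing, here is a minimal sketch using the data from the question. The names `first_lists` and `first_elems` are only for illustration and do not appear in the original code. On an object column holding lists, `.str[0]` picks element 0 of each cell, so applying it twice reaches the first element of the first list.

```python
import pandas as pd

dataF = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [[[1, 2], [3, 4]], [[0, 2], [5, 6]], [[1, 3], [7, 8]],
          [[0, 4], [9, 10]], [[1, 5], [11, 12]]],
})

# First .str[0]: the first inner list of each cell, e.g. [1, 2]
first_lists = dataF['B'].str[0]

# Second .str[0]: the first element of that inner list, e.g. 1
first_elems = first_lists.str[0]

print(first_elems.tolist())
```

The resulting Series lines up with the index of `dataF`, which is why it can be turned into a boolean mask with `isin` or `==` and used to select rows.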

1 Comment

This is what I need for one column. Any idea about how to extend this to multiple columns like in the second part of the question?
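
One possible way to extend this to several columns is to build one boolean mask per column and combine them with a logical OR before indexing. This is only a sketch, assuming every cell in the checked columns is a non-empty list of non-empty lists. The names `cols`, `mask` and `extDF` are illustrative.

```python
import pandas as pd

dataF = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [[[1, 2], [3, 4]], [[0, 2], [5, 6]], [[1, 3], [7, 8]],
          [[0, 4], [9, 10]], [[1, 5], [11, 12]]],
    'C': [[[0, 2], [3, 4]], [[1, 2], [5, 6]], [[0, 3], [7, 8]],
          [[0, 4], [9, 10]], [[1, 5], [11, 12]]],
})

cols = ['B', 'C']  # columns whose first-of-first element is checked

# Start with "no row selected", then OR in the condition for each column
mask = pd.Series(False, index=dataF.index)
for col in cols:
    mask = mask | (dataF[col].str[0].str[0] == 1)

# Apply the combined mask once; rows keep their original order and index
extDF = dataF[mask].copy()
print(extDF)
```

Because the mask is applied once to the original dataframe, each row can appear at most once and the original order is preserved, so `drop_duplicates` and `sort_values` should not be needed. Add `reset_index(drop=True)` if a fresh 0-based index is wanted.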
