I have two .CSV files:

The first is a dataset with over 1500 features and 300 samples.

The second is an RFECV ranking of the features. [Image: an example of the two files]

I'm trying to remove every feature column from the dataset that does not have a ranking of 1.

So we should end up with something like this: [image of the expected result]

What would be the proper way of doing something like that in Python?

I was thinking of transposing the second array, finding the indexes where the ranking is 1, and copying the columns at those indexes from the dataset into another array.

2 Answers

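Try this, where df1 is the dataset and df2 is the ranking table:

# Names of the features whose ranking is 1
rank_1 = df2[df2.Ranking == 1].Features
# Keep only those columns of the dataset
new_df = df1[rank_1]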
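
For anyone who wants to try the idea without the original files, here is a minimal sketch on toy data. It assumes the ranking table has columns named `Features` and `Ranking`, and that the names in `Features` match the dataset's column headers exactly. The frames and values are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the dataset CSV: 3 samples, 4 features.
df1 = pd.DataFrame({
    "f1": [0.1, 0.2, 0.3],
    "f2": [1.0, 1.1, 1.2],
    "f3": [5, 6, 7],
    "f4": [9, 8, 7],
})

# Hypothetical stand-in for the RFECV ranking CSV.
df2 = pd.DataFrame({
    "Features": ["f1", "f2", "f3", "f4"],
    "Ranking": [1, 3, 1, 2],
})

# Names of the features whose ranking is 1, as a plain list of labels.
rank_1 = df2.loc[df2["Ranking"] == 1, "Features"].tolist()

# Select only those columns from the dataset.
new_df = df1[rank_1]

print(list(new_df.columns))
```

No transposing is needed here: the boolean filter on `Ranking` picks the feature names directly, and indexing the dataset with a list of names returns just those columns.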
1 Comment

Thank you very much, It's exactly how I needed it and way simpler too!
1
import pandas as pd

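# Load the dataset and the RFECV ranking table
df1 = pd.read_csv("path-to-first-csv-file.csv")
df2 = pd.read_csv("path-to-second-csv-file.csv")

# Keep only the dataset columns whose feature has a ranking of 1
result = df1[df2.loc[df2["Ranking"] == 1, "Features"]]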
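
If the ranking file might list a feature name that is not a column in the dataset, selecting by label raises a `KeyError`. One possible way around that, sketched here under the same assumed column names (`Features`, `Ranking`), is to build a boolean mask over the dataset's columns instead. The helper name `keep_rank_one` and the sample frames are hypothetical.

```python
import pandas as pd


def keep_rank_one(data, ranking):
    """Return the columns of `data` whose feature has Ranking == 1."""
    selected = ranking.loc[ranking["Ranking"] == 1, "Features"]
    # isin() builds a boolean mask over the dataset's columns, so names that
    # appear only in the ranking table are ignored instead of raising KeyError.
    mask = data.columns.isin(selected)
    return data.loc[:, mask]


# Made-up data: "zzz" is ranked 1 but does not exist in the dataset.
data = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})
ranking = pd.DataFrame({
    "Features": ["a", "b", "c", "zzz"],
    "Ranking": [1, 2, 1, 1],
})

result = keep_rank_one(data, ranking)
print(list(result.columns))
```

A side effect of the mask is that the selected columns keep the dataset's original order rather than the order of the ranking table.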