I have a large dataframe: 100000 rows * 10000 columns.
I'm given a list of labels (call it list1) that do not match the dataframe's column labels exactly, but match part of them. For example, a column label in the dataframe might be "string1,D111", while the corresponding label in list1 would be "D111".
I want to use list1 to find all the corresponding columns and then sum those columns. What is the most efficient way to do this?
Dataframe:
string1,D111 string2,D222 string3,D333 ...... stringn,Dnnn
1 .. .. .. ..
2
3
4
5
6
...
My list1: D111, D333, ..., Dxxx
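
To make the question concrete, here is a minimal sketch of one possible approach in pandas. It is not necessarily the most efficient one. It assumes that the part of each column label after the last comma is the key to match against list1, and that "sum" means a row-wise sum across the matched columns. The small frame and the names `keys`, `mask`, `matched_cols` and `row_sums` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Small stand-in for the real 100000 x 10000 frame (hypothetical data).
df = pd.DataFrame(
    np.arange(12).reshape(3, 4),
    columns=["string1,D111", "string2,D222", "string3,D333", "string4,D444"],
)
list1 = ["D111", "D333"]

# Assumption: the text after the last comma is the key to match on.
keys = df.columns.str.rsplit(",", n=1).str[-1]

# Membership test against a set, so each column label is checked once
# instead of being compared with every entry of list1.
mask = keys.isin(set(list1))

# Labels of the matched columns, and the row-wise sum across them.
matched_cols = df.columns[mask].tolist()
row_sums = df.loc[:, mask].sum(axis=1)
```

If per-column totals are wanted instead, `df.loc[:, mask].sum()` (the default `axis=0`) would give one value per matched column.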