
I have a large dataframe: 100,000 rows × 10,000 columns.

I'm given a list of labels (call it list1) that do not match the dataframe's column labels exactly, but match part of them. For example, a column label in the dataframe might be "string1,D111", while the corresponding label in list1 would be "D111".

I want to use list1 to find all the corresponding columns and then sum them. What is the most efficient way to do this?

Dataframe:
       string1,D111       string2,D222          string3,D333   ......    stringn,Dnnn
1         ..                   ..                     ..                     ..
2
3
4
5
6
...


My list1:  D111, D333,...Dxxx

1 Answer

In [1]: from numpy.random import randn
In [2]: from pandas import DataFrame

In [3]: df = DataFrame(randn(10,10),columns=[ 'c_%s' % i for i in range(3)] + ['d_%s' % i for i in range(3) ] + ['e_%s' % i for i in range(4)])

In [4]: df.filter(regex='d_|e_')
Out[4]:
        d_0       d_1       d_2       e_0       e_1       e_2       e_3
0 -0.022661 -0.504317  0.279227  0.286951 -0.126999 -1.658422  1.577863
1  0.501654  0.145550 -0.864171 -0.374261 -0.399360  1.217679  1.357648
2 -0.608580  1.138143  1.228663  0.427360  0.256808  0.105568 -0.037422
3 -0.993896 -0.581638 -0.937488  0.038593 -2.012554 -0.182407  0.689899
4  0.424005 -0.913518  0.405155 -1.111424 -0.180506  1.211730  0.118168
5  0.701127  0.644692 -0.188302 -0.561400  0.748692 -0.585822  1.578240
6  0.475958 -0.901369 -0.734969  1.090093  1.297208  1.140128  0.173941
7 -0.679514 -0.790529 -2.057733  0.420175  1.766671 -0.797129 -0.825583
8 -0.918645  0.916237  0.992001 -0.440573 -1.875960 -1.223502  0.084821
9  1.096687 -1.414057 -0.268211  0.253461 -0.175931  1.481261 -0.200600
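
The filter call above only selects columns; it does not build the pattern from list1 or do the summing the question asks for. Below is a minimal sketch of how that might look, not a definitive implementation. It assumes every column label has the form "prefix,LABEL" as in the question, and the small frame and the names `pattern`, `matched`, `row_sums`, `col_sums`, `suffixes` and `matched_alt` are made up for illustration. Since "sum all these columns" could mean either one total per row or one total per column, both are shown.

```python
import re

import numpy as np
import pandas as pd

# Hypothetical small stand-in for the question's 100000 x 10000 frame.
df = pd.DataFrame(
    np.arange(12).reshape(3, 4),
    columns=["string1,D111", "string2,D222", "string3,D333", "string4,D444"],
)
list1 = ["D111", "D333"]

# Build a single alternation pattern from list1.
# re.escape protects against regex metacharacters in the labels; the leading
# comma and trailing $ make "D11" match only ",D11", not ",D111".
pattern = ",(?:" + "|".join(re.escape(label) for label in list1) + ")$"

# Same DataFrame.filter(regex=...) call as in the answer, with the built pattern.
matched = df.filter(regex=pattern)

row_sums = matched.sum(axis=1)  # one total per row, across the matched columns
col_sums = matched.sum()        # one total per matched column

# Alternative without regex (assumes the part after the last comma is the label):
# split each column label once from the right and test membership in list1.
suffixes = df.columns.str.rsplit(",", n=1).str[-1]
matched_alt = df.loc[:, suffixes.isin(list1)]
```

With many labels, the membership test in the alternative may be cheaper than one very long regex, but that is worth timing on the real data rather than taking on faith.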