I access rows in pandas with the loc indexer, as below:
pdf.loc[pdf.a>2]
Is this vectorised? Is it better than NumPy-style boolean indexing:
pdf[pdf.a>2]
This timing suggests there is no slowdown with loc:
import numpy as np
import pandas as pd
testa = pd.DataFrame(np.arange(10000000), columns=['q'])
%timeit testb = testa.loc[testa.q>6]
%timeit testc = testa[testa.q>7]
1 loop, best of 3: 207 ms per loop
1 loop, best of 3: 208 ms per loop
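As a small sanity check alongside the timings, the sketch below compares the two forms on a tiny made-up frame. The frame contents and the names `mask`, `via_loc`, `via_brackets` and `same` are assumptions for illustration only. Both forms receive the same boolean Series, which is built by a single vectorised comparison, so they should select the same rows.

```python
import pandas as pd

# Small hypothetical frame standing in for `pdf`
pdf = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]})

# The comparison runs once over the whole column and yields a boolean Series
mask = pdf.a > 2

via_loc = pdf.loc[mask]    # row selection through the .loc indexer
via_brackets = pdf[mask]   # row selection through plain [] with a boolean mask

# Both selections hold the same rows, values and index labels
same = via_loc.equals(via_brackets)
```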
loc[] is better than a for loop when you do a conditional update based on columns. numpy will be faster, but then you lose the indices, which are super useful and inherent to pandas. pdf.to_numpy()[np.where(pdf.a > 2)[0]] should be faster than .loc
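A minimal sketch of the points above, using a small made-up frame with string index labels (the data and the names `updated`, `looped`, `rows` and `as_array` are assumptions, not part of the original post). It shows a conditional update with `.loc`, the equivalent row-by-row loop, and the NumPy route, which returns a bare array without the index.

```python
import numpy as np
import pandas as pd

# Hypothetical frame standing in for `pdf`, with non-default index labels
pdf = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]},
                   index=list("wxyz"))

# Conditional update with .loc: set b to 0 wherever a > 2, in one statement
updated = pdf.copy()
updated.loc[updated.a > 2, "b"] = 0

# The same update written as an explicit Python loop over the index labels
looped = pdf.copy()
for label in looped.index:
    if looped.at[label, "a"] > 2:
        looped.at[label, "b"] = 0

# NumPy route: positional row numbers, then a plain ndarray of the matches
rows = np.where(pdf.a > 2)[0]
as_array = pdf.to_numpy()[rows]

# .loc keeps the labels "y" and "z"; the ndarray has no index at all
kept_labels = list(pdf.loc[pdf.a > 2].index)
```

Whether the NumPy version is actually faster depends on the frame. With mixed column dtypes, `to_numpy()` has to build a single array of a common dtype first, so it is worth timing on your own data.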