I have a pandas DataFrame with ~5m rows. I am looking for an efficient way to store each row in a list.
import pandas as pd
df = pd.DataFrame({
'id': [0, 1, 2, 3],
'val': ['w','x','y','z'],
'pos': ['p1','p2','p3','p4']
})
# Using List comprehensions
df_lst = []
[df_lst.append(rows) for rows in df.iterrows()]
Given the size of the DataFrame, I am looking for a more efficient way to store the rows in a list. Is there a vectorized solution to this?
df.to_numpy().tolist()? list(df.iterrows()) is equivalent to your loop. But: 1. Why do you need this? A DataFrame is already an efficient way to store the data. 2. You are using the list comprehension wrong; it should be df_lst = [rows for rows in df.iterrows()]
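To make the suggestion in the comment concrete, here is a minimal sketch of a few ways to get the rows out without iterrows(). The names rows_as_lists, rows_as_tuples and rows_as_dicts are only illustrative, and which shape is appropriate depends on what the list is needed for, which the question does not say.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [0, 1, 2, 3],
    'val': ['w', 'x', 'y', 'z'],
    'pos': ['p1', 'p2', 'p3', 'p4'],
})

# Each row as a plain list of values (the index is dropped).
rows_as_lists = df.to_numpy().tolist()

# Each row as a plain tuple; name=None skips building namedtuples.
rows_as_tuples = list(df.itertuples(index=False, name=None))

# Each row as a dict keyed by column name.
rows_as_dicts = df.to_dict('records')

print(rows_as_lists[0])
print(rows_as_tuples[0])
print(rows_as_dicts[0])
```

Note that iterrows() yields (index, Series) pairs, so it builds a Series object per row, which is a large part of why it is slow. All of the alternatives above still materialize every value as a Python object, so with ~5m rows the resulting list will likely use much more memory than the DataFrame itself.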