Is there some way of reading only a particular column, selected by its index, from a CSV file using pandas (preferably read_csv)? I understand that read_csv can read specific columns by name, but the data file has no header, so I cannot use column names. Note that the file is very large, so I do not want to read in the entire file and then subset. Thanks.
2 Answers
Here is an example illustrating the answer given by EdChum. There are many additional options for loading a CSV file; check the API reference.
import pandas as pd

raw_data = {'first_name': ['Steve', 'Guido', 'John'],
            'last_name': ['Jobs', 'Van Rossum', 'von Neumann']}
df = pd.DataFrame(raw_data)
# Save the data without a header row (the index is still written as column 0)
df.to_csv(path_or_buf='test.csv', header=False)
# Tell read_csv there is no header and load only column 1 (the first name)
df = pd.read_csv(filepath_or_buffer='test.csv', header=None, usecols=[1], names=['first_name'])
df
first_name
0 Steve
1 Guido
2 John
usecols supports ordinal (zero-based) indexing: usecols=[1, 4] will read only the 2nd and 5th columns.
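As a minimal sketch of the positional form, the snippet below reads a small header-less CSV from an in-memory string and keeps only the 2nd and 5th columns. The sample data and the names csv_text and subset are made up for illustration; with a real file you would pass its path instead of the StringIO object.

```python
import io

import pandas as pd

# Hypothetical header-less data: three rows, five columns.
csv_text = "10,11,12,13,14\n20,21,22,23,24\n30,31,32,33,34\n"

# header=None tells read_csv that the first row is data, not column names.
# usecols=[1, 4] selects columns by zero-based position, so only the
# 2nd and 5th columns are parsed into the resulting DataFrame.
subset = pd.read_csv(io.StringIO(csv_text), header=None, usecols=[1, 4])

print(subset)
```

Without a names argument, the resulting columns are labeled by their original positions (1 and 4 here), so a column is accessed as subset[1] or subset[4].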