Python numpy, skip columns & read csv file

Question

I've got a CSV file with 20 columns & about 60000 rows.

I'd like to read fields 2 to 20 only. I've tried the code below, but the browser (I'm using IPython) freezes and it just goes on for ages.

import numpy as np
from numpy import genfromtxt

myFile = 'sampleData.csv'
myData = genfromtxt(myFile, delimiter=',', usecols=(2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19))
print(myData)

How could I tweak this to work better & actually produce output please?

I'd think the reading is fast and it's the printing that takes the time. Time just the read, then print only what you need: try myData[:10] etc. Do you have missing values? Are you getting error messages?
– roadrunner66, Commented Mar 2, 2016 at 4:53
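
A minimal sketch of that suggestion: time only the read, then print a small slice instead of the whole array. The in-memory CSV here is a made-up stand-in for sampleData.csv (1000 rows, 20 numeric columns, no header), so the timing is only illustrative. Note that usecols is zero-based, so fields 2 to 20 are indices 1 to 19.

```python
import io
import time

import numpy as np

# Hypothetical stand-in for sampleData.csv: 1000 rows x 20 numeric columns.
n_rows = 1000
csv_text = "\n".join(
    ",".join(str(r * 20 + c) for c in range(20)) for r in range(n_rows)
)

# Time only the read, not the printing.
start = time.perf_counter()
data = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    usecols=tuple(range(1, 20)),  # zero-based indices 1..19 = fields 2..20
)
elapsed = time.perf_counter() - start

print("read took %.3f s, shape %s" % (elapsed, data.shape))

# Print only the first 10 rows rather than the whole array.
print(data[:10])
```

If the read itself turns out to be quick, the freeze is most likely coming from rendering the full output in the notebook.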
genfromtxt() is notoriously slow. Try loadtxt(), which is marginally faster, or read it as a pandas dataframe, which is apparently much faster. You can use the read_csv() function.
– bunji, Commented Mar 2, 2016 at 5:06
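
For the loadtxt() route, something like the following might work. It assumes the file is purely numeric with a single header line and no missing values (loadtxt does not handle missing values the way genfromtxt does); the small in-memory CSV and its column names c0..c19 are made up for illustration.

```python
import io

import numpy as np

# Hypothetical stand-in for sampleData.csv: one header line, then 5 numeric rows.
header = ",".join("c%d" % c for c in range(20))
body = "\n".join(
    ",".join(str(r * 20 + c) for c in range(20)) for r in range(5)
)
csv_text = header + "\n" + body

data = np.loadtxt(
    io.StringIO(csv_text),
    delimiter=",",
    skiprows=1,                   # skip the header line
    usecols=tuple(range(1, 20)),  # zero-based indices 1..19 = fields 2..20
)

print(data.shape)
print(data[:3])
```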

usert4jju7 · Accepted Answer · 2016-03-02 05:35:01Z

2

import pandas as pd

myFile = 'sampleData.csv'
# read_csv already returns a DataFrame; skiprows=1 skips the first line of the file (the header)
df = pd.read_csv(myFile, skiprows=1)

print(df)

This works like a charm
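
The snippet above reads every column. To read only fields 2 to 20, as the question asks, read_csv() also takes a usecols argument. A rough sketch, assuming the file has one header row; the in-memory CSV and the column names c0..c19 are invented for the example:

```python
import io

import pandas as pd

# Hypothetical stand-in for sampleData.csv: a header row plus 5 numeric rows.
header = ",".join("c%d" % c for c in range(20))
body = "\n".join(
    ",".join(str(r * 20 + c) for c in range(20)) for r in range(5)
)
csv_text = header + "\n" + body

# Zero-based positions 1..19 correspond to fields 2..20; the header row is
# kept and used for the column names.
df = pd.read_csv(io.StringIO(csv_text), usecols=list(range(1, 20)))

# head() prints only the first few rows, which stays fast on large files.
print(df.head())

# If a plain NumPy array is needed afterwards:
values = df.to_numpy()
```

With a real file, the io.StringIO(...) argument would be replaced by the file name.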

