I have CSV data that looks like the sample below, and I'm trying to read it into a pandas DataFrame. I've tried all sorts of combinations based on the ample documentation online. For example:
pd.read_csv("https://www.nwrfc.noaa.gov/natural/nat_norm_text.cgi?id=TDAO3.csv", delimiter=',', skiprows=0, low_memory=False)
and I get this error:
ParserError: Error tokenizing data. C error: Expected 1 fields in line 3, saw 989
Or I try it like this, but get an empty DataFrame:
pd.read_csv('https://www.nwrfc.noaa.gov/natural/nat_norm_text.cgi?id=TDAO3.csv', skiprows=2,
skipfooter=3,index_col=[0], header=None,
engine='python', # c engine doesn't have skipfooter
sep='delimiter')
Out[31]:
Empty DataFrame
Columns: []
Index: []
The first 10 lines of the csv file look like this:
# Water Supply Monthly Volumes for COLUMBIA - THE DALLES DAM (TDAO3)
# Volumes are in KAF
ID,Calendar Year,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec
TDAO3,1948,,,,,,,,,,6866.8,4307.04,4379.38
TDAO3,1949,3546.71,4615.1,8513.31,15020.45,35251.67,21985.99,11226.06,6966.73,4727.37,4406.29,5266.74,5595.91
TDAO3,1950,4353.86,5540.21,9696.27,12854.81,23359.51,39246.78,23393.23,9676.77,5729.74,6990.31,8300.03,8779.57
TDAO3,1951,8032.32,10295.98,7948.59,16144.8,36000.88,28334.09,19735.49,9308.15,6546.95,8907.1,6461.14,6425.76
TDAO3,1952,4671,6222.25,6551.62,18678.3,34866.91,27120.65,15994.18,7907.55,4810.39,3954.32,3259.29,3231.49
TDAO3,1953,7839.72,7870.96,6527.74,9474.66,23384.47,32668.32,17422.63,8655.16,5220.04,5130.46,5183.5,5915.14
TDAO3,1954,5197.51,5967.07,6718.36,10813.69,29190.37,32673.26,29624.38,13456.13,9165.78,5440.92,5732.22,4973.53
thank you,
Varying skiprows=2 and sep='delimiter', I do see some data, but it looks different: there is <br> in my data. It seems the response is not pure CSV but HTML displaying the data, and that can cause problems. Maybe it needs sep='<br>'.
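If the rows really are separated by <br> tags rather than newlines, one option is to download the text first, turn each <br> into a newline, and only then hand it to pandas. The following is a minimal sketch under that assumption. The helper name parse_br_separated_csv is hypothetical, and the sample string is built from the lines quoted in the question, not from the live response, so the real page may need further cleanup.

```python
import io
import re

import pandas as pd


def parse_br_separated_csv(text):
    """Parse CSV-like text whose rows may be separated by <br> tags.

    Hypothetical helper: assumes the data itself contains no '<' or '>'.
    """
    # Turn every <br>, <br/> or <br /> into a real newline.
    text = re.sub(r"<br\s*/?>", "\n", text, flags=re.IGNORECASE)
    # Drop any other HTML tags that may wrap the data.
    text = re.sub(r"<[^>]+>", "", text)
    # comment='#' skips the '#' title lines, so the first remaining
    # line (ID,Calendar Year,Jan,...) is used as the header.
    return pd.read_csv(io.StringIO(text), comment="#")


# Sample assembled from the lines shown in the question (assumed layout).
sample = (
    "# Water Supply Monthly Volumes for COLUMBIA - THE DALLES DAM (TDAO3)<br>"
    "# Volumes are in KAF<br>"
    "ID,Calendar Year,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec<br>"
    "TDAO3,1948,,,,,,,,,,6866.8,4307.04,4379.38<br>"
    "TDAO3,1949,3546.71,4615.1,8513.31,15020.45,35251.67,21985.99,"
    "11226.06,6966.73,4727.37,4406.29,5266.74,5595.91<br>"
)

df = parse_br_separated_csv(sample)
print(df)
```

For the real URL, the text could be fetched with something like urllib.request.urlopen(url).read().decode() and passed to the same helper. Cleaning the text before parsing is probably more robust than sep='<br>', because sep splits a line into columns, not into rows.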