
I am importing a 2 GB CSV file with 600 columns into a DataFrame, but it raises a memory error every time. I would like to exclude some columns while importing.

Please let me know how to achieve this.

My code:

import pandas as pd
sourceFileName = r'C:\sunil_plus\dataset\a3rfghj.csv'
data = pd.read_csv(sourceFileName, dtype=object)

Sample Data:

"Country Name","Country Code","Indicator Name","Indicator Code","Counterpart Country Name","Counterpart Country Code",Attribute,1948,1949,1950,1951,1952,1953,1954,1955,1956,1957,1958,1959,1960,1960Q1,1960M1,1960M2,1960M3,1960Q2,1960M4,1960M5,1960M6,1960Q3,1960M7,1960M8,1960M9,1960Q4,1960M10,1960M11,1960M12,1961,1961Q1,1961M1,1961M2,1961M3,1961Q2,1961M4,1961M5,1961M6,1961Q3,1961M7,1961M8,1961M9,1961Q4,1961M10,1961M11,1961M12,1962,1962Q1,1962M1,1962M2,1962M3,1962Q2,1962M4,1962M5,1962M6,1962Q3,1962M7,1962M8,1962M9,1962Q4,1962M10,1962M11,1962M12,1963,1963Q1,1963M1,1963M2,1963M3,1963Q2,1963M4,1963M5,1963M6,1963Q3,1963M7,1963M8,1963M9,1963Q4,1963M10,1963M11,1963M12,1964,1964Q1,1964M1,1964M2,1964M3,1964Q2,1964M4,1964M5,1964M6,1964Q3,1964M7,1964M8,1964M9,1964Q4,1964M10,1964M11,1964M12,1965,1965Q1,1965M1,1965M2,1965M3,1965Q2,1965M4,1965M5,1965M6,1965Q3,1965M7,1965M8,1965M9,1965Q4,1965M10,1965M11,1965M12,1966,1966Q1,1966M1,1966M2,1966M3,1966Q2,1966M4,1966M5,1966M6,1966Q3,1966M7,1966M8,1966M9,1966Q4,1966M10,1966M11,1966M12,1967,1967Q1,1967M1,1967M2,1967M3,1967Q2,1967M4,1967M5,1967M6,1967Q3,1967M7,1967M8,1967M9,1967Q4,1967M10,1967M11,1967M12,1968,1968Q1,1968M1,1968M2,1968M3,1968Q2,1968M4,1968M5,1968M6,1968Q3,1968M7,1968M8,1968M9,1968Q4,1968M10,1968M11,1968M12,1969,1969Q1,1969M1,1969M2,1969M3,1969Q2,1969M4,1969M5,1969M6,1969Q3,1969M7,1969M8,1969M9,1969Q4,1969M10,1969M11,1969M12,1970,1970Q1,1970M1,1970M2,1970M3,1970Q2,1970M4,1970M5,1970M6,1970Q3,1970M7,1970M8,1970M9,1970Q4,1970M10,1970M11,1970M12,1971,1971Q1,1971M1,1971M2,1971M3,1971Q2,1971M4,1971M5,1971M6,1971Q3,1971M7,1971M8,1971M9,1971Q4,1971M10,1971M11,1971M12,1972,1972Q1,1972M1,1972M2,1972M3,1972Q2,1972M4,1972M5,1972M6,1972Q3,1972M7,1972M8,1972M9,1972Q4,1972M10,1972M11,1972M12,1973,1973Q1,1973M1,1973M2,1973M3,1973Q2,1973M4,1973M5,1973M6,1973Q3,1973M7,1973M8,1973M9,1973Q4,1973M10,1973M11,1973M12,1974,1974Q1,1974M1,1974M2,1974M3,1974Q2,1974M4,1974M5,1974M6,1974Q3,1974M7,1974M8,1974M9,1974Q4,1974M10,1974M11,1974M12,1975,1975Q1,
1975M1,1975M2,1975M3,1975Q2,1975M4,1975M5,1975M6,1975Q3,1975M7,1975M8,1975M9,1975Q4,1975M10,1975M11,1975M12,1976,1976Q1,1976M1,1976M2,1976M3,1976Q2,1976M4,1976M5,1976M6,1976Q3,1976M7,1976M8,1976M9,1976Q4,1976M10,1976M11,1976M12,1977,1977Q1,1977M1,1977M2,1977M3,1977Q2,1977M4,1977M5,1977M6,1977Q3,1977M7,1977M8,1977M9,1977Q4,1977M10,1977M11,1977M12,1978,1978Q1,1978M1,1978M2,1978M3,1978Q2,1978M4,1978M5,1978M6,1978Q3,1978M7,1978M8,1978M9,1978Q4,1978M10,1978M11,1978M12,1979,1979Q1,1979M1,1979M2,1979M3,1979Q2,1979M4,1979M5,1979M6,1979Q3,1979M7,1979M8,1979M9,1979Q4,1979M10,1979M11,1979M12,1980,1980Q1,1980M1,1980M2,1980M3,1980Q2,1980M4,1980M5,1980M6,1980Q3,1980M7,1980M8,1980M9,1980Q4,1980M10,1980M11,1980M12,1981,1981Q1,1981M1,1981M2,1981M3,1981Q2,1981M4,1981M5,1981M6,1981Q3,1981M7,1981M8,1981M9,1981Q4,1981M10,1981M11,1981M12,1982,1982Q1,1982M1,1982M2,1982M3,1982Q2,1982M4,1982M5,1982M6,1982Q3,1982M7,1982M8,1982M9,1982Q4,1982M10,1982M11,1982M12,1983,1983Q1,1983M1,1983M2,1983M3,1983Q2,1983M4,1983M5,1983M6,1983Q3,1983M7,1983M8,1983M9,1983Q4,1983M10,1983M11,1983M12,1984,1984Q1,1984M1,1984M2,1984M3,1984Q2,1984M4,1984M5,1984M6,1984Q3,1984M7,1984M8,1984M9,1984Q4,1984M10,1984M11,1984M12,1985,1985Q1,1985M1,1985M2,1985M3,1985Q2,1985M4,1985M5,1985M6,1985Q3,1985M7,1985M8,1985M9,1985Q4,1985M10,1985M11,1985M12,1986,1986Q1,1986M1,1986M2,1986M3,1986Q2,1986M4,1986M5,1986M6,1986Q3,1986M7,1986M8,1986M9,1986Q4,1986M10,1986M11,1986M12,1987,1987Q1,1987M1,1987M2,1987M3,1987Q2,1987M4,1987M5,1987M6,1987Q3,1987M7,1987M8,1987M9,1987Q4,1987M10,1987M11,1987M12,1988,1988Q1,1988M1,1988M2,1988M3,1988Q2,1988M4,1988M5,1988M6,1988Q3,1988M7,1988M8,1988M9,1988Q4,1988M10,1988M11,1988M12,1989,1989Q1,1989M1,1989M2,1989M3,1989Q2,1989M4,1989M5,1989M6,1989Q3,1989M7,1989M8,1989M9,1989Q4,1989M10,1989M11,1989M12,1990,1990Q1,1990M1,1990M2,1990M3,1990Q2,1990M4,1990M5,1990M6,1990Q3,1990M7,1990M8,1990M9,1990Q4,1990M10,1990M11,1990M12,1991,1991Q1,1991M1,1991M2,1991M3,1991Q2,1991M4,1991M5,1991M6,1991Q3,1991M7,1991M8,1991M9,199
1Q4,1991M10,1991M11,1991M12,1992,1992Q1,1992M1,1992M2,1992M3,1992Q2,1992M4,1992M5,1992M6,1992Q3,1992M7,1992M8,1992M9,1992Q4,1992M10,1992M11,1992M12,1993,1993Q1,1993M1,1993M2,1993M3,1993Q2,1993M4,1993M5,1993M6,1993Q3,1993M7,1993M8,1993M9,1993Q4,1993M10,1993M11,1993M12,1994,1994Q1,1994M1,1994M2,1994M3,1994Q2,1994M4,1994M5,1994M6,1994Q3,1994M7,1994M8,1994M9,1994Q4,1994M10,1994M11,1994M12,1995,1995Q1,1995M1,1995M2,1995M3,1995Q2,1995M4,1995M5,1995M6,1995Q3,1995M7,1995M8,1995M9,1995Q4,1995M10,1995M11,1995M12,1996,1996Q1,1996M1,1996M2,1996M3,1996Q2,1996M4,1996M5,1996M6,1996Q3,1996M7,1996M8,1996M9,1996Q4,1996M10,1996M11,1996M12,1997,1997Q1,1997M1,1997M2,1997M3,1997Q2,1997M4,1997M5,1997M6,1997Q3,1997M7,1997M8,1997M9,1997Q4,1997M10,1997M11,1997M12,1998,1998Q1,1998M1,1998M2,1998M3,1998Q2,1998M4,1998M5,1998M6,1998Q3,1998M7,1998M8,1998M9,1998Q4,1998M10,1998M11,1998M12,1999,1999Q1,1999M1,1999M2,1999M3,1999Q2,1999M4,1999M5,1999M6,1999Q3,1999M7,1999M8,1999M9,1999Q4,1999M10,1999M11,1999M12,2000,2000Q1,2000M1,2000M2,2000M3,2000Q2,2000M4,2000M5,2000M6,2000Q3,2000M7,2000M8,2000M9,2000Q4,2000M10,2000M11,2000M12,2001,2001Q1,2001M1,2001M2,2001M3,2001Q2,2001M4,2001M5,2001M6,2001Q3,2001M7,2001M8,2001M9,2001Q4,2001M10,2001M11,2001M12,2002,2002Q1,2002M1,2002M2,2002M3,2002Q2,2002M4,2002M5,2002M6,2002Q3,2002M7,2002M8,2002M9,2002Q4,2002M10,2002M11,2002M12,2003,2003Q1,2003M1,2003M2,2003M3,2003Q2,2003M4,2003M5,2003M6,2003Q3,2003M7,2003M8,2003M9,2003Q4,2003M10,2003M11,2003M12,2004,2004Q1,2004M1,2004M2,2004M3,2004Q2,2004M4,2004M5,2004M6,2004Q3,2004M7,2004M8,2004M9,2004Q4,2004M10,2004M11,2004M12,2005,2005Q1,2005M1,2005M2,2005M3,2005Q2,2005M4,2005M5,2005M6,2005Q3,2005M7,2005M8,2005M9,2005Q4,2005M10,2005M11,2005M12,2006,2006Q1,2006M1,2006M2,2006M3,2006Q2,2006M4,2006M5,2006M6,2006Q3,2006M7,2006M8,2006M9,2006Q4,2006M10,2006M11,2006M12,2007,2007Q1,2007M1,2007M2,2007M3,2007Q2,2007M4,2007M5,2007M6,2007Q3,2007M7,2007M8,2007M9,2007Q4,2007M10,2007M11,2007M12,2008,2008Q1,2008M1,2008M2,2008M3,2008Q2,2008M4,2008M
5,2008M6,2008Q3,2008M7,2008M8,2008M9,2008Q4,2008M10,2008M11,2008M12,2009,2009Q1,2009M1,2009M2,2009M3,2009Q2,2009M4,2009M5,2009M6,2009Q3,2009M7,2009M8,2009M9,2009Q4,2009M10,2009M11,2009M12,2010,2010Q1,2010M1,2010M2,2010M3,2010Q2,2010M4,2010M5,2010M6,2010Q3,2010M7,2010M8,2010M9,2010Q4,2010M10,2010M11,2010M12,2011,2011Q1,2011M1,2011M2,2011M3,2011Q2,2011M4,2011M5,2011M6,2011Q3,2011M7,2011M8,2011M9,2011Q4,2011M10,2011M11,2011M12,2012,2012Q1,2012M1,2012M2,2012M3,2012Q2,2012M4,2012M5,2012M6,2012Q3,2012M7,2012M8,2012M9,2012Q4,2012M10,2012M11,2012M12,2013,2013Q1,2013M1,2013M2,2013M3,2013Q2,2013M4,2013M5,2013M6,2013Q3,2013M7,2013M8,2013M9,2013Q4,2013M10,2013M11,2013M12,2014,2014Q1,2014M1,2014M2,2014M3,2014Q2,2014M4,2014M5,2014M6,2014Q3,2014M7,2014M8,2014M9,2014Q4,2014M10,2014M11,2014M12,2015,2015Q1,2015M1,2015M2,2015M3,2015Q2,2015M4,2015M5,2015M6,2015Q3,2015M7,2015M8,2015M9,2015Q4,2015M10,2015M11,2015M12,2016,2016Q1,2016M1,2016M2,2016M3,2016Q2,2016M4,2016M5,2016M6,2016Q3,2016M7,2016M8,2016M9,2016Q4,2016M10,2016M11,2016M12,2017Q1,2017M1,2017M2,2017M3,
"Advanced Economies","110","Goods, Value of Exports, Free on board (FOB), US Dollars","TXG_FOB_USD","Lao People's Democratic Republic","544","Value",,,,,,,,"600000","7500000","11200000","10100000","9900000","13590000",,,,,,,,,,,,,,,,,"31800000",,,,,,,,,,,,,,,,,"22190000",,,,,,,,,,,,,,,,,"19310000",,,,,,,,,,,,,,,,,"13400000",,,,,,,,,,,,,,,,,"17500000",,,,,,,,,,,,,,,,,"20000000",,,,,,,,,,,,,,,,,"21000000",,,,,,,,,,,,,,,,,"28220000",,,,,,,,,,,,,,,,,"46444000",,,,,,,,,,,,,,,,,"42244000",,,,,,,,,,,,,,,,,"28035000",,,,,,,,,,,,,,,,,"22672000",,,,,,,,,,,,,,,,,"35970000",,,,,,,,,,,,,,,,,"52015000",,,,,,,,,,,,,,,,,"24440000",,,,,,,,,,,,,,,,,"14690000",,,,,,,,,,,,,,,,,"32230000",,,,,,,,,,,,,,,,,"38550000",,,,,,,,,,,,,,,,,"34860000",,,,,,,,,,,,,,,,,"55760000",,,,,,,,,,,,,,,,,"37930000",,,,,,,,,,,,,,,,,"35560000",,,,,,,,,,,,,,,,,"39910000",,,,,,,,,,,,,,,,,"15670000",,,,,,,,,,,,,,,,,"26490000",,,,,,,,,,,,,,,,,"20760000",,,,,,,,,,,,,,,,,"26680000",,,,,,,,,,,,,,,,,"35095211.66717",,,,,,,,,,,,,,,,,"36724275.976975",,,,,,,,,,,,,,,,,"36915312.0753272",,,,,,,,,,,,,,,,,"46665636.4768923",,,,,,,,,,,,,,,,,"66109300.8664534",,,,,,,,,,,,,,,,,"87913998.4608552",,,,,,,,,,,,,,,,,"148354775.637297",,,,,,,,,,,,,,,,,"163819040.691337",,,,,,,,,,,,,,,,,"156024282.72955",,,,,,,,,,,,,,,,,"133137318.554441",,,,,,,,,,,,,,,,,"94399137.7518639",,,,,,,,,,,,,,,,,"117343069.557997",,,,,,,,,,,,,,,,,"117587881.503864",,,,,,,,,,,,,,,,,"98762873.9625738",,,,,,,,,,,,,,,,,"109005712.389615",,,,,,,,,,,,,,,,,"106595482.789968",,,,,,,,,,,,,,,,,"172941555.708648",,,,,,,,,,,,,,,,,"160810422.300987",,,,,,,,,,,,,,,,,"166500301.797628",,,,,,,,,,,,,,,,,"259543675.819615",,,,,,,,,,,,,,,,,"312728794.020366",,,,,,,,,,,,,,,,,"353281591.775144",,,,,,,,,,,,,,,,,"398101531.917045",,,,,,,,,,,,,,,,,"624671291.89195",,,,,,,,,,,,,,,,,"745336092.052549",,,,,,,,,,,,,,,,,"630395816.514461",,,,,,,,,,,,,,,,,"777975419.209323",,,,,,,,,,,,,,,,,"733654481.53557",,,,,,,,,,,,,,,,,"503096458.712403",,,,,,,,,,,,,,,,,,,,,
  • Have you tried the usecols parameter? cf. pandas.pydata.org/pandas-docs/stable/generated/… Commented Jul 8, 2017 at 13:42
  • @AdrienMatissart Yes, I tried, but the CSV file contains more than 600 columns and I want to remove just 5 of them, because they contain full text. Commented Jul 8, 2017 at 13:47
  • Maybe you could pass usecols a callable to filter the desired columns with a simple condition. We would need more details about your data to help you. Commented Jul 8, 2017 at 13:52
  • Do the 5 columns you want to exclude hold so much data that dropping them could solve your memory problem? Commented Jul 8, 2017 at 13:52
  • @AdrienMatissart I updated the sample data. I want to exclude the columns "Country Name", "Indicator Name" and "Counterpart Country Name" because they contain full text, which increases the size. usecols with a list of names could be used, but I have many files with different column names. Commented Jul 8, 2017 at 13:59

2 Answers


To exclude 3 columns while importing, you could do:

data = pd.read_csv(
 sourceFileName,
 usecols=lambda col: col not in ["Country Name","Indicator Name","Counterpart Country Name"]
)
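For anyone who wants to try this without the original file, here is a minimal, self-contained sketch. The in-memory CSV and the name `excluded` are illustrative assumptions, not part of the question's data. It also shows that a name in the exclusion set that is absent from the file (here "Counterpart Country Name") does not raise an error, which may help when files have different column names.

```python
import io
import pandas as pd

# Tiny stand-in for the real file; the values are illustrative only.
csv_text = (
    '"Country Name","Country Code","Indicator Name","Indicator Code",1948,1949\n'
    '"Advanced Economies","110","Goods, Value of Exports","TXG_FOB_USD",,"600000"\n'
)

# Columns to skip. "Counterpart Country Name" is not in this sample,
# which is fine: the callable is only asked about names that exist.
excluded = {"Country Name", "Indicator Name", "Counterpart Country Name"}

data = pd.read_csv(
    io.StringIO(csv_text),
    usecols=lambda col: col not in excluded,  # True means "keep this column"
)
print(list(data.columns))
```

The excluded columns are skipped at parse time, so they never end up in the resulting DataFrame.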

2 Comments

Hi Adrien, would you mind explaining that lambda in more detail? How does that work? How does read_csv know what the starting set of columns to exclude from even is? Thanks!
@newbie This works because pandas.read_csv() was designed to accept either a collection (e.g. a list) of column names or a callable (that is, a reference to a function). The thing to remember is that "col" in this example is not a pre-defined variable; it is the parameter of the anonymous function that lambda creates. read_csv reads the column names from the file's header and calls the function with each name, keeping the columns for which it returns True. The docs for this parameter give this example: usecols=lambda x: x.upper() in ['AAA', 'BBB', 'DDD'].
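To make the mechanism in the comment above concrete, here is a small sketch that replaces the lambda with a named function and records what it receives. The names `keep_column` and `seen` and the three-column CSV are made up for illustration.

```python
import io
import pandas as pd

seen = []  # column names that read_csv passed to our function

def keep_column(name):
    """Return True for columns that should be loaded."""
    seen.append(name)
    return name != "b"

df = pd.read_csv(io.StringIO("a,b,c\n1,2,3\n"), usecols=keep_column)

print(list(df.columns))  # the columns that were kept
print(sorted(set(seen)))  # the names the function was asked about
```

The function receives the names found in the header, so no starting list has to be supplied. It is probably best not to rely on how many times each name is passed, since that is an implementation detail.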
columns_to_be_removed = ['a', 'b']
data = pd.read_csv(sourceFileName).drop(columns_to_be_removed, axis='columns')
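Note that this reads every column into memory before dropping any, so it is unlikely to help with the memory error in the question. A possible variation, sketched here under assumptions, reads only the header first and then loads just the wanted columns. The helper name `read_without` and the demo file are hypothetical.

```python
import os
import tempfile
import pandas as pd

def read_without(path, unwanted):
    """Read a CSV while skipping the `unwanted` columns at parse time."""
    header = pd.read_csv(path, nrows=0)  # column names only, no data rows
    keep = [c for c in header.columns if c not in unwanted]
    return pd.read_csv(path, usecols=keep)

# Demo on a throwaway file (illustrative data).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.csv")
    with open(path, "w", newline="") as f:
        f.write("a,b,c\n1,2,3\n4,5,6\n")
    result = read_without(path, ["a", "b"])

print(result)
```

Building the `keep` list per file means the same exclusion list can be reused across files whose remaining columns differ.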

1 Comment

Please edit your answer to format code into code blocks to make it easier to read.
