Reading column names alone in a csv file

Question

I have a csv file with the following columns:

id,name,age,sex

Followed by a lot of values for the above columns. I am trying to read the column names alone and put them inside a list.

I am using DictReader, and this gives the correct details:

with open('details.csv') as csvfile:
    i=["name","age","sex"]
    re=csv.DictReader(csvfile)
    for row in re:
        for x in i:
            print row[x]

But instead of hardcoding the list of columns ("i" in the code above), I want it to be parsed automatically from the input csv.

with open('details.csv') as csvfile:
   
    rows=iter(csv.reader(csvfile)).next()
    header=rows[1:]
    re=csv.DictReader(csvfile)
    for row in re:
        print row
        for x in header:
            
            print row[x]

This gives an error

KeyError: 'name'

in the line print row[x]. Where am I going wrong? Is it possible to fetch the column names using DictReader?

You'll get the error: DictReader instance has no attribute '__getitem__'
– Tania, Commented Mar 3, 2015 at 16:35
id,name,age,sex 100101,Herbert,21,m 100102,Keith,18,m 100103,Jennifer,15,f
– Tania, Commented Mar 3, 2015 at 16:44
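For readers wondering where the KeyError comes from, here is a small sketch in Python 3. It assumes an in-memory stand-in for details.csv (the io.StringIO object is only for illustration) built from the sample rows in the comment above. Reading the header with csv.reader consumes the first line, so a DictReader created afterwards on the same file object treats the first data row as the header.

```python
import csv
import io

# Hypothetical in-memory stand-in for details.csv, using the sample rows from the comment.
sample = io.StringIO(
    "id,name,age,sex\n"
    "100101,Herbert,21,m\n"
    "100102,Keith,18,m\n"
)

header = next(csv.reader(sample))      # consumes the real header line
dict_reader = csv.DictReader(sample)   # now starts at the first data line

print(header)                  # the real column names
print(dict_reader.fieldnames)  # the first data row, mistaken for the header
```

Because 'name' is not among those mistaken field names, row['name'] raises KeyError for every remaining row.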

user3194712 · Accepted Answer · 2024-01-23 17:20:59Z

154

Though you already have an accepted answer, I figured I'd add this for anyone else interested in a different solution-

The csv module's DictReader object has a public attribute called fieldnames (as of Python 2.6 and above). https://docs.python.org/3.4/library/csv.html#csv.csvreader.fieldnames

An implementation could be as follows:

import csv

with open('C:/mypath/to/csvfile.csv', 'r') as f:
    dict_reader = csv.DictReader(f)

    #get header fieldnames from DictReader and store in list
    headers = dict_reader.fieldnames

    #sample file reading logic
    for line in dict_reader:
        print(line[headers[0]])

In the above, dict_reader.fieldnames returns a list of your headers (assuming the headers are in the top row), which allows:

>>> print(headers)
['MyColumn1', 'MyColumn2', 'MyColumn3']
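As a self-contained illustration (a sketch only: the in-memory data and its values are made up, and a real file would be opened with open(path, newline='')), accessing fieldnames reads just the header line, and the data rows remain available afterwards:

```python
import csv
import io

# Hypothetical sample data standing in for a real csv file.
f = io.StringIO("MyColumn1,MyColumn2,MyColumn3\n1,2,3\n4,5,6\n")

dict_reader = csv.DictReader(f)
headers = dict_reader.fieldnames   # reads only the header line on first access
first_row = next(dict_reader)      # the first data row is still there

print(headers)
print(first_row["MyColumn1"])
```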

If your headers are in, say the 2nd row (with the very top row being row 1), you could do as follows:

import csv

with open('C:/mypath/to/csvfile.csv', 'r') as f:
    #you can eat the first line before creating DictReader.
    #if no "fieldnames" param is passed into
    #DictReader object upon creation, DictReader
    #will read the upper-most line as the headers
    f.readline()
    
    dict_reader = csv.DictReader(f)
    headers = dict_reader.fieldnames

    #sample file reading logic
    for line in dict_reader:
        print(line[headers[0]])
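The same idea can be generalized to any number of leading lines. The sketch below uses a hypothetical helper, read_headers, and made-up sample data. It assumes the lines to skip are plain single-line text (no quoted fields spanning several lines).

```python
import csv
import io
from itertools import islice

def read_headers(f, skip=0):
    """Hypothetical helper: skip `skip` leading lines, then return (headers, DictReader)."""
    # Consume the leading lines so that DictReader sees the header line first.
    for _ in islice(f, skip):
        pass
    dict_reader = csv.DictReader(f)
    return dict_reader.fieldnames, dict_reader

# Made-up input: one title line above the real header.
sample = io.StringIO("some report title\nid,name,age,sex\n100101,Herbert,21,m\n")
headers, dict_reader = read_headers(sample, skip=1)
rows = list(dict_reader)
print(headers)
```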

edited Jan 23, 2024 at 17:20

answered Mar 3, 2015 at 19:03

user3194712


4 Comments

Tania Over a year ago

This is a neat solution! :)

teewuane Over a year ago

When I "upgraded" to python3 the fieldnames property now returns None. According to the docs it looks like it should still work, but it doesn't. For what it's worth. I'm using python 3.7. I think there was a change in what the DictReader returns in python3.6.

user3194712 Over a year ago

Interesting - for what it's worth, a hello world CSV reader in Python 3.7 gives proper fieldnames for me

s3dev Over a year ago

The DictReader.fieldnames solution is very efficient. More efficient than other methods mentioned in this post, according to my timing. Thanks!
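One situation in which fieldnames does come back as None is when there is no header line left to read, for example an empty file or a file object that has already been read to the end. Whether that was the cause in the comment above is not known; the sketch below only demonstrates this one case, using in-memory data as a stand-in for real files.

```python
import csv
import io

# Empty input: there is no header row to read.
empty_reader = csv.DictReader(io.StringIO(""))
print(empty_reader.fieldnames)

# Input that something else has already consumed before DictReader was created.
exhausted = io.StringIO("id,name\n1,a\n")
exhausted.read()
late_reader = csv.DictReader(exhausted)
print(late_reader.fieldnames)
```

Rewinding with exhausted.seek(0) before creating the DictReader would make the header readable again.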

Eric O. Lebigot · Accepted Answer · 2020-09-06 20:40:31Z

84

You can read the header by using the next() function, which returns the next row of the reader's iterable object as a list. Then you can add the rest of the file's content to a list.

import csv
with open("C:/path/to/file.csv", "rb") as f:
    reader = csv.reader(f)
    i = reader.next()
    rest = list(reader)

Now i holds the column names as a list.

print i
>>>['id', 'name', 'age', 'sex']

Also note that reader.next() does not work in Python 3. Instead, use the built-in next() to get the first line of the csv immediately after creating the reader, and open the file in text mode, like so:

import csv
with open("C:/path/to/file.csv", "r", newline="") as f:
    reader = csv.reader(f)
    i = next(reader)

    print(i)
    # ['id', 'name', 'age', 'sex']
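A minimal Python 3 sketch of the same approach wrapped in a function. The helper name read_header, the utf-8 default, and the temporary demo file are assumptions for illustration. Passing None as the second argument to next() avoids a StopIteration on an empty file, and newline='' is the mode the csv module documentation recommends for files passed to csv.reader.

```python
import csv
import os
import tempfile

def read_header(path, encoding="utf-8"):
    """Hypothetical helper: return the first csv row as a list, or None if the file is empty."""
    with open(path, "r", newline="", encoding=encoding) as f:
        return next(csv.reader(f), None)

# Demo on a temporary file containing made-up rows.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("id,name,age,sex\n100101,Herbert,21,m\n")
    tmp_path = tmp.name

header = read_header(tmp_path)
os.remove(tmp_path)
print(header)
```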

edited Sep 6, 2020 at 20:40

Eric O. Lebigot


answered Mar 3, 2015 at 16:48

Daniel


7 Comments

Tania Over a year ago

Thanks for your solution. But I want to access them just by column names. Suppose I need just the last 3 columns: I would have to access them by index in this case, e.g. i[0] and so on. Is it possible to do this with DictReader? Then I could just access row[name]. I just need my code to support multiple columns, and I am more concerned with the column names than with indexes.

Daniel Over a year ago

Kind of. So for example if you access names, you want to retrieve all the names in the file?

Daniel Over a year ago

Cool, sorry I didn't understand clearly what you needed. Glad I could help

Tania Over a year ago

Yes I figured that out. Thanks for the solution. Your iterator worked like a CHARM.

Tyler Dane Over a year ago

Python3 ppl should do: with open(csv_path, "rt") as f: Open in text mode to avoid iterator error (csv.Error: iterator should return strings, not bytes)


nmvega · Accepted Answer · 2018-12-06 13:34:30Z

The csv.DictReader object exposes an attribute called fieldnames, and that is what you'd use. Here's example code, followed by input and corresponding output:

import csv
file = "/path/to/file.csv"
with open(file, mode='r', encoding='utf-8') as f:
    reader = csv.DictReader(f, delimiter=',')
    for row in reader:
        print([col + '=' + row[col] for col in reader.fieldnames])

Input file contents:

col0,col1,col2,col3,col4,col5,col6,col7,col8,col9
00,01,02,03,04,05,06,07,08,09
10,11,12,13,14,15,16,17,18,19
20,21,22,23,24,25,26,27,28,29
30,31,32,33,34,35,36,37,38,39
40,41,42,43,44,45,46,47,48,49
50,51,52,53,54,55,56,57,58,59
60,61,62,63,64,65,66,67,68,69
70,71,72,73,74,75,76,77,78,79
80,81,82,83,84,85,86,87,88,89
90,91,92,93,94,95,96,97,98,99

Output of print statements:

['col0=00', 'col1=01', 'col2=02', 'col3=03', 'col4=04', 'col5=05', 'col6=06', 'col7=07', 'col8=08', 'col9=09']
['col0=10', 'col1=11', 'col2=12', 'col3=13', 'col4=14', 'col5=15', 'col6=16', 'col7=17', 'col8=18', 'col9=19']
['col0=20', 'col1=21', 'col2=22', 'col3=23', 'col4=24', 'col5=25', 'col6=26', 'col7=27', 'col8=28', 'col9=29']
['col0=30', 'col1=31', 'col2=32', 'col3=33', 'col4=34', 'col5=35', 'col6=36', 'col7=37', 'col8=38', 'col9=39']
['col0=40', 'col1=41', 'col2=42', 'col3=43', 'col4=44', 'col5=45', 'col6=46', 'col7=47', 'col8=48', 'col9=49']
['col0=50', 'col1=51', 'col2=52', 'col3=53', 'col4=54', 'col5=55', 'col6=56', 'col7=57', 'col8=58', 'col9=59']
['col0=60', 'col1=61', 'col2=62', 'col3=63', 'col4=64', 'col5=65', 'col6=66', 'col7=67', 'col8=68', 'col9=69']
['col0=70', 'col1=71', 'col2=72', 'col3=73', 'col4=74', 'col5=75', 'col6=76', 'col7=77', 'col8=78', 'col9=79']
['col0=80', 'col1=81', 'col2=82', 'col3=83', 'col4=84', 'col5=85', 'col6=86', 'col7=87', 'col8=88', 'col9=89']
['col0=90', 'col1=91', 'col2=92', 'col3=93', 'col4=94', 'col5=95', 'col6=96', 'col7=97', 'col8=98', 'col9=99']

smassey · Accepted Answer · 2021-01-22 11:39:56Z

6

How about

with open(csv_input_path + file, 'r') as ft:
    header = ft.readline()  # read only the first line; returns a string
    header_list = header.strip().split(',')  # strip the trailing newline, then split into a list

I am assuming your input file is in CSV format. If you use pandas instead, it takes more time when the file is big, because it loads the entire data set.
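Splitting on commas works for simple headers, but it gives a different result from the csv module when a column name is quoted and contains a comma. A small comparison, using a made-up header line:

```python
import csv

# Hypothetical header line in which one column name contains a comma.
header = 'id,"name, full",age\n'

naive = header.strip().split(',')      # plain string split
parsed = next(csv.reader([header]))    # csv-aware parsing of the same single line

print(naive)   # ['id', '"name', ' full"', 'age']
print(parsed)  # ['id', 'name, full', 'age']
```

csv.reader accepts any iterable of strings, so a one-element list holding the line from readline() is enough.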

edited Jan 22, 2021 at 11:39

smassey


answered Mar 27, 2019 at 0:20

Shriganesh Kolhe


Tania · Accepted Answer · 2015-03-03 17:09:15Z

2

Thanking Daniel Jimenez for his perfect solution to fetch column names alone from my csv, I extend his solution to use DictReader so we can iterate over the rows using column names as indexes. Thanks Jimenez.

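import csv

# Python 2 code (reader.next(), print statement, "rb" mode).
with open('myfile.csv') as csvfile:
    with open("myfile.csv", "rb") as f:
        reader = csv.reader(f)
        i = reader.next()  # header row as a list
        i = i[1:]          # drop the first column name (id)
        re = csv.DictReader(csvfile)
        for row in re:
            for x in i:
                print row[x]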

answered Mar 3, 2015 at 17:09

Tania


J11 · Accepted Answer · 2018-07-18 17:45:29Z

2

I am just showing how to get all the column names from a csv file, using the pandas library.

First we read the file.

import pandas as pd
file = pd.read_csv('details.csv')

Then, to get all the column names from the input file as a list, use:

columns = list(file.head(0))

answered Jul 18, 2018 at 17:45

J11


Adnan Ali · Accepted Answer · 2018-09-11 05:44:42Z

2

Here is the code to print only the headers (column names) of the csv file.

import csv
HEADERS = next(csv.reader(open('filepath.csv')))
print (HEADERS)

Another method with pandas

import pandas as pd
HEADERS = list(pd.read_csv('filepath.csv').head(0))
print (HEADERS)

edited Sep 11, 2018 at 5:44

answered Sep 11, 2018 at 5:25

Adnan Ali


vishnuteja · Accepted Answer · 2020-10-17 06:08:30Z

0

import pandas as pd
data = pd.read_csv("data.csv")
cols = data.columns

answered Oct 17, 2020 at 6:08

vishnuteja


1 Comment

Yunnosch Over a year ago

While this code may solve the question, including an explanation of how and why this solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add explanations and give an indication of what limitations and assumptions apply.

Aqsa javed · Accepted Answer · 2021-11-25 05:55:39Z

0

Using pandas is also an option.

But instead of loading the full file into memory, you can retrieve only the first chunk of it to get the field names, by passing iterator=True.

import pandas as pd

file = pd.read_csv('details.csv', iterator=True)
column_names_full = file.get_chunk(1)  # DataFrame holding only the first data row
column_names = [column for column in column_names_full]  # iterating a DataFrame yields its column names
print(column_names)

answered Nov 25, 2021 at 5:55

Aqsa javed


ron_g · Accepted Answer · 2020-12-14 16:19:46Z

-1

I literally just wanted the first row of my data, which contains the headers I need, and didn't want to iterate over all my data to get them, so I just did this:

import csv

# data is the path to the csv file
with open(data, 'r', newline='') as csvfile:
    t = 0
    for i in csv.reader(csvfile, delimiter=',', quotechar='|'):
        if t > 0:
            break
        else:
            dbh = i  # first row, i.e. the headers
            t += 1

answered Dec 14, 2020 at 16:19

ron_g
