Use string literal instead of header name in Pandas csv file manipulation

Question

Python 3.9.5/Pandas 1.1.3

I use the following code to create a nested dictionary object from a csv file with headers:

import pandas as pd
import json
import os

csv = "/Users/me/file.csv"
csv_file = pd.read_csv(csv, sep=",", header=0, index_col=False)
csv_file['org'] = csv_file[['location', 'type']].apply(lambda s: s.to_dict(), axis=1)

This creates a nested object called org from the data in the columns called location and type.
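
For illustration, here is a minimal sketch of what that line produces. It uses a small in-memory sample in place of the real file; the sample rows and the extra name column are made up for the example.

```python
import io

import pandas as pd

# Hypothetical stand-in for /Users/me/file.csv (contents are invented).
sample_csv = io.StringIO("location,type,name\nParis,office,a\nOslo,lab,b\n")
csv_file = pd.read_csv(sample_csv, sep=",", header=0, index_col=False)

# Each row of the two selected columns becomes one dict in the new 'org' column.
csv_file["org"] = csv_file[["location", "type"]].apply(lambda s: s.to_dict(), axis=1)

org_values = csv_file["org"].tolist()
print(org_values)
# [{'location': 'Paris', 'type': 'office'}, {'location': 'Oslo', 'type': 'lab'}]
```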

Now let's say the type column doesn't even exist in the csv file, and I want to pass a literal string as the type value instead of values from a column. For example, I want to create a nested object called org using the values from the location column as before, but use the string foo as the value of a key called type in every row. How can I accomplish this?

Serge Ballesta · Accepted Answer · 2021-09-14 13:11:18Z

1

You could just build it by hand:

csv_file['org'] = csv_file['location'].apply(lambda x: {'location': x,
                                                        'type': 'foo'})
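
As a quick check, the same line can be run against a small hand-made frame. This is only a sketch: the sample data below is invented and deliberately has no type column.

```python
import pandas as pd

# Hypothetical sample data; note that there is no 'type' column.
csv_file = pd.DataFrame({"location": ["Paris", "Oslo"], "name": ["a", "b"]})

# Series.apply calls the lambda once per value, so no axis argument is needed.
csv_file["org"] = csv_file["location"].apply(
    lambda x: {"location": x, "type": "foo"}
)

org_values = csv_file["org"].tolist()
print(org_values)
# [{'location': 'Paris', 'type': 'foo'}, {'location': 'Oslo', 'type': 'foo'}]
```

Because the lambda receives a single value rather than a row, the s.to_dict() call and axis=1 from the original code are not used here.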

3 Comments

Stpete111 Over a year ago

Hi - I like that this uses less code than the other answer, but can you help me put it in the context of my code? For example, when I try to work your suggestion into my code, I get errors around the s.to_dict() and axis=1 arguments.

Serge Ballesta Over a year ago

@Stpete111: this is already in the context of your own code. Did you try it as is?

Stpete111 Over a year ago

Sorry about that, I had assumed I still needed to consider the other arguments in addition to your code. It does work, and it does so in a nice pythonic way. Thanks!

Marat · Accepted Answer · 2021-09-14 13:15:41Z

1

Use ChainMap. This lets you use multiple columns (columns_to_use) and even override existing ones (if type is among these columns, its value will be overridden):

from collections import ChainMap

# .. some code
csv_file['org'] = csv_file[columns_to_use].apply(
    lambda s: ChainMap({'type': 'foo'}, s.to_dict()), axis=1)
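
A self-contained sketch of the same idea follows. The sample frame and the contents of columns_to_use are assumptions made for the example. It also wraps the ChainMap in dict(), which is a small variation on the snippet above: ChainMap is not a dict subclass, so json.dumps would otherwise refuse to serialize it.

```python
import json
from collections import ChainMap

import pandas as pd

# Hypothetical sample data that does contain a 'type' column.
csv_file = pd.DataFrame({"location": ["Paris", "Oslo"], "type": ["office", "lab"]})
columns_to_use = ["location", "type"]  # assumed column list

# The first mapping passed to ChainMap wins on key lookup, so 'foo' overrides
# the row's own 'type' value. dict() flattens the result into a plain dict.
csv_file["org"] = csv_file[columns_to_use].apply(
    lambda s: dict(ChainMap({"type": "foo"}, s.to_dict())), axis=1
)

org_values = csv_file["org"].tolist()
org_json = json.dumps(org_values)
```

If the type column is absent, dropping it from columns_to_use gives the same result, since the constant mapping supplies the key on its own.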

BTW, without adding constant values it could be done by df.to_dict():

csv_file['org'] = csv_file[['location', 'type']].to_dict('records')
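
One possible way to combine this with a constant value, not part of the answer above, is to add the constant as a temporary column with DataFrame.assign before calling to_dict. A minimal sketch, assuming an invented sample frame without a type column:

```python
import pandas as pd

# Hypothetical sample data; there is no 'type' column in it.
csv_file = pd.DataFrame({"location": ["Paris", "Oslo"], "name": ["a", "b"]})

# assign() returns a copy with the constant column added, so csv_file itself
# never gains a 'type' column; to_dict('records') yields one dict per row.
csv_file["org"] = csv_file[["location"]].assign(type="foo").to_dict("records")

org_values = csv_file["org"].tolist()
```

This avoids a per-row lambda, which may matter on large files.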

edited Sep 14, 2021 at 13:15

answered Sep 14, 2021 at 13:01

5 Comments

Stpete111 Over a year ago

Sorry, let me clarify - assume that there isn't actually a column called type in the csv file and I'm trying to create that secondary key:value pair from scratch, where the key is type and the value is foo. In this case, will your code still work? (I have updated my question to clarify this.)

Marat Over a year ago

@Stpete111 it will still work. Just drop it from the list of columns. Updated the answer to reflect that

Stpete111 Over a year ago

Ok so it's not going to give me an error for the column type not existing in the csv file? I will try it now...

Stpete111 Over a year ago

I'm getting a "maximum recursion level reached" error. Trying to diagnose now...

Stpete111 Over a year ago

Ok, recursion error was on me - too many square brackets. But now getting a lambda error.
