
Python 3.9.5/Pandas 1.1.3

I use the following code to create a nested dictionary object from a csv file with headers:

import pandas as pd
import json
import os

csv = "/Users/me/file.csv"
csv_file = pd.read_csv(csv, sep=",", header=0, index_col=False)
csv_file['org'] = csv_file[['location', 'type']].apply(lambda s: s.to_dict(), axis=1)

This creates a nested object called org from the data in the columns called location and type.

Now let's say the type column doesn't exist in the csv file at all, and I want to pass a literal string as the type value instead of reading it from a column. For example, I want to create a nested object called org using the values from the location column as before, but with the string foo as the value of a key called type in every row. How can I accomplish this?

2 Answers


You could just build it by hand:

csv_file['org'] = csv_file['location'].apply(lambda x: {'location': x,
                                                        'type': 'foo'})
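
For anyone who wants to try this without the original file, here is a minimal, self-contained sketch. The in-memory CSV text and its name column are made up for illustration; only the location column and the constant 'foo' come from the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for /Users/me/file.csv: note there is no 'type' column.
csv_text = "location,name\nParis,a\nOslo,b\n"
csv_file = pd.read_csv(io.StringIO(csv_text), sep=",", header=0, index_col=False)

# Build each nested dict by hand: 'location' comes from the row,
# 'type' is the same literal string for every row.
csv_file['org'] = csv_file['location'].apply(
    lambda x: {'location': x, 'type': 'foo'})

org_values = csv_file['org'].tolist()
print(org_values)
```

Because apply is called on a single column (a Series), the lambda receives one scalar value per row, so no axis=1 or s.to_dict() is needed.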

3 Comments

Hi - I like that this uses less code than the other answer - but can you help me with putting it in the context of my code. For example, when I try to implement your suggestion into my code, I'm getting errors around the s.to_dict() and axis=1 arguments.
@Stpete111: this is already in the context of your own code. Did you try it as is?
Sorry about that, I had assumed I still needed to consider the other arguments in addition to your code. It does work, and it does so in a nice pythonic way. Thanks!

Use ChainMap. This lets you use multiple columns (columns_to_use) and even override existing ones: if type is among those columns, the constant value takes precedence over the value from the csv file.

from collections import ChainMap

# .. some code
csv_file['org'] = csv_file[columns_to_use].apply(
    lambda s: ChainMap({'type': 'foo'}, s.to_dict()), axis=1)
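
To make the lookup order concrete, here is a small standard-library sketch of what ChainMap does for a single row. The row dict is a hypothetical example of what s.to_dict() might return. One thing to be aware of: a ChainMap is not a dict subclass, so if the goal is JSON output it probably needs converting with dict() first.

```python
import json
from collections import ChainMap

# Hypothetical row, standing in for s.to_dict() on one line of the csv.
row = {'location': 'Paris', 'type': 'from_csv'}

# Lookups search the mappings left to right, so the constant wins
# over any 'type' value that came from the row.
merged = ChainMap({'type': 'foo'}, row)
print(merged['type'])       # the constant, not the row's value
print(merged['location'])   # falls through to the row

# Flatten into a plain dict before serializing.
plain = dict(merged)
as_json = json.dumps(plain, sort_keys=True)
print(as_json)
```

If a plain dict is wanted from the start, `{**s.to_dict(), 'type': 'foo'}` gives the same precedence without ChainMap.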

Incidentally, if you don't need to add constant values, DataFrame.to_dict('records') does this directly:

csv_file['org'] = csv_file[['location', 'type']].to_dict('records')
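
The records approach can probably be stretched to cover the constant as well by adding a temporary type column with assign before converting. This is only a sketch under that assumption; the sample DataFrame and its name column are made up for illustration.

```python
import pandas as pd

# Hypothetical data standing in for the csv file (no 'type' column).
csv_file = pd.DataFrame({'location': ['Paris', 'Oslo'], 'name': ['a', 'b']})

# assign() returns a copy with the constant column added, so the
# original frame never gains a 'type' column of its own.
records = csv_file[['location']].assign(type='foo').to_dict(orient='records')
csv_file['org'] = records

print(csv_file['org'].tolist())
```

This avoids a Python-level lambda per row, which may matter on larger files.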

5 Comments

Sorry, let me clarify - assume that there isn't actually a column called type in the csv file and I'm trying to create that second key:value pair from scratch, where the key is type and the value is foo. In that case, will your code still work? (I have updated my question to clarify this.)
@Stpete111 it will still work. Just drop it from the list of columns. Updated the answer to reflect that
Ok so it's not going to give me an error for the column type not existing in the csv file? I will try it now...
I'm getting maximum recursion level reached error. Trying to diagnose now...
Ok, recursion error was on me - too many square brackets. But now getting a lambda error.
