0

Say I have two lists:

header = ['a', 'b', 'c', 'd']
data_type = ['str', 'str', 'float64', 'float64']

How do I combine them into a dictionary like this:

data_type = {'a':str, 'b':str, 'c':float64, 'd':float64}

It will be passed as the dtype argument of pd.read_csv.


2 Answers

6

Fastest:

header = ['a', 'b', 'c', 'd']
data_type = ['str', 'str', 'float64', 'float64']
dict(zip(header, data_type))

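The idea is: the zip function (https://docs.python.org/2/library/functions.html#zip) pairs up the two lists element by element, and then the dict function converts the resulting sequence of (key, value) tuples into a dictionary. Note that the values in this dictionary are still strings such as 'float64', not type objects.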
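To make the two steps visible, here is a small sketch that prints the intermediate pairs before building the dictionary. It uses only the lists from the question; the names `pairs` and `dtype_map` are mine, chosen for illustration.

```python
header = ['a', 'b', 'c', 'd']
data_type = ['str', 'str', 'float64', 'float64']

# zip pairs the i-th header with the i-th type name.
# list() is needed on Python 3, where zip returns a lazy iterator.
pairs = list(zip(header, data_type))
print(pairs)

# dict() turns each (key, value) pair into a dictionary entry.
# The values are still strings here, not type objects.
dtype_map = dict(pairs)
print(dtype_map)
```

If the two lists differ in length, zip silently stops at the shorter one, so it may be worth checking len(header) == len(data_type) first.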


6 Comments

@JulienMarrec I'm sorry bro. I have no idea how it happened.
@JulienMarrec that cannot be 1 person that's for sure. Maybe because you answered the same thing after Andrey? pandas questions are sometimes harsh for pure python answerers.
@Jean-FrançoisFabre Actually, at first his answer was only about the dict comprehension, and since zip is supposed to be more efficient (and maybe a bit more elegant), some users decided that it was not the best way. Anyway, his answer is also right and I didn't downvote it.
I had understood that you didn't. I avoid downvoting "concurrent" answers to mine, unless they're really bad/have misleading/wrong contents. BTW 5 upvotes for using zip that's really a good investment. My guess is that there are a lot of pandas specialists out there that don't know a lot of useful basic python stuff. Congrats.
@Jean-FrançoisFabre Yeah, it was a bit unpredictable :) Anyway, I'm glad it is so helpful for folks here.
1

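From your question it seems you actually want the type objects themselves (str, float64) and not strings naming them, so I would use a dict comprehension with eval. Only do this when the type names come from a source you trust, since eval executes whatever string it is given. This is how I would do it: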

from numpy import float64
{header[i]:eval(data_type[i]) for i in range(len(header))}

{'a': str, 'b': str, 'c': numpy.float64, 'd': numpy.float64}

Also note that the dict comprehension option is slightly slower, at 1.91 µs per loop versus 1.62 µs per loop for the dict+zip option.
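Timings like these depend on the machine and the Python version, so it may be worth measuring on your own setup. Below is a rough sketch using the standard timeit module. It is not the benchmark that produced the numbers above: it keeps the values as strings (no eval, no numpy import), and the helper names `with_zip` and `with_comprehension` are made up for this example.

```python
import timeit

header = ['a', 'b', 'c', 'd']
data_type = ['str', 'str', 'float64', 'float64']

def with_zip():
    # Pair the lists with zip, then build the dict in one call.
    return dict(zip(header, data_type))

def with_comprehension():
    # Index both lists by position inside a dict comprehension.
    return {header[i]: data_type[i] for i in range(len(header))}

# Both approaches must build the same dictionary.
assert with_zip() == with_comprehension()

n = 100000  # number of calls to time for each approach
t_zip = timeit.timeit(with_zip, number=n)
t_comp = timeit.timeit(with_comprehension, number=n)
print("dict+zip:      %.2f microseconds per call" % (t_zip / n * 1e6))
print("comprehension: %.2f microseconds per call" % (t_comp / n * 1e6))
```

For a four-element list either approach is far cheaper than reading the CSV itself, so readability is probably the better reason to pick one over the other.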

5 Comments

how the hell did this get 3 downvotes in 10 freaking secs?
I upvoted your answer just because it is not fair to get so many downvotes.
I also upvoted your answer because it helps clear a few concepts for me. My second week with python.
@LedgerYu also another way then: {k: data_type[i] for (i, k) in enumerate(header)}. Using enumerate here for me looks more pythonic - and a little bit more transparent
upvoting because it seems to really answer the question. The question is unclear about data type of the str/float64 values.
